Advanced Programming: working with cache.

Advanced Programming: working with cache.

Let's say the next example. We have two tables, countries and cities, and they are related as

select * from countries
    inner join cities on countries.idcountry=cities.idcities

And we want to cache the result.

What is our motivation?

Our motivate is because the fastest query to the database is the query that it is not executed.

In this case, we want to cache the information it once, store it in memory, and use it many times. So, we could use the same data many times, but we are querying to the database once. However, we should also consider that the database also has its own cache, but it is not managed by us, so we don't know what is cached, when or where, and usually, it is flushed automatically.

Alt Text

Invalidating cache.

The main problem with cache is the invalidation.

For example, what if we cache the query?

select * from countries; -- 100 countries

And later, we decide to add a new country (South Sudan, for example). However, our cache doesn't know that there is a new country, so we should invalidate the whole cache.

select * from countries; -- 100 countries (from cache)
-- we invalidate the cache
select * from countries; -- 101 countries (from the database)

Invalidating via ID

Alt Text

This form of invalidation invalidates a specific query. However, we need (in some ways), to know which cache to invalidates, so it is not practical if we want to invalidate it manually. It is usually used internally.

How is it done?

select * from countries; 

We could use the query as an ID, but it is unpractical. Instead, we could generate a hash (sha256, for example)

and it is our id

286c960d1c9c21e84df5a263bddcd2771da976fa306cfdb3fdea60e174a680a8

While it is not shorter but it always has the same length, even if the query is long or more complex, so it could be used as a key.

But, what if the query has arguments?.

select * from countries where idcountry=?; 

Then, we also store the arguments and generates a hash, including the arguments. So, our hash contains all the information for the query: the query and the arguments (if any)

So then

select * from countries; -- it reads from the database and store in the cache
-- here we invalidate the cache id 286c960d1c9c21e84df5a263bddcd2771da976fa306cfdb3fdea60e174a680a8
select * from countries; -- since the cache is gone, then it reads again from the database and store in the cache

Invalidating via TTL Time to live

One of the ways to invalidate the cache is via TTL. So, our cache has an expiration date. It has some pros and cons.

  • The pro is it is easy to deal with invalidation because it works automatically.
  • The cons, if we give a short life, we should use the database more often. For example, if we are querying the data every minute and hour TTL expires every 2 minutes, our efficiency is 50% (1 out of 2 calls use the cache).
  • The other cons, if we give it a long life, then we could read outdated information.
select * from countries; -- we store it for (example) 5 seconds
-- (we wait 3 seconds)
select * from countries; -- it reads from the cache that it is still alive
-- (we wait another 3 seconds)
select * from countries; -- it reads from the database because the cache expired.

Invalid via family or group

Alt Text

Another way to invalidate cache is via a family or group of the cache.

select * from countries; -- stored in the family called "countries" 
-- countries=['select * from countries']

and

select * from countries where idcountry=56; -- also stored in the family called "countries"
-- countries=['select * from countries','select * from countries where idcountry=56']

Then we decide to add a new country, so we could invalidate the whole family "countries", and this family countries invalidates both queries.

However

It is our initial query.

select * from countries
    inner join cities on countries.idcountry=cities.idcities

This query could depend on two families, "countries" and "cities". So, this query could be invalidated by two families, if we add a new country or if we add a new city.

Implementing

We will use two libraries.

EFTEC/PdoOne: Pdo One (github.com)

EFTEC/CacheOne: CacheOne is a cache of service for php (github.com)

You can add it manually or via composer.

composer require eftec/pdoone
composer require eftec/cacheone

CacheOne works with Redis (recommended), the PHP extension APCU, Memcache or the file system.

Connecting to the database and the cache system

// You must add the libraries prior to be used
$cache=new CacheOne('redis',127.0.0.1,'redistest',6379); // if you don't have redis, then you could use any other.
$cache->setSerializer('json-object'); // the information is stored as json-object.
$pdoOne=new PdoOne('mysql',127.0.0.1,'root','abc.123','dbtest');
$pdoOne->setCacheService($cache); // di
$pdoOne->open(); // open database

Caching information

$countries=$pdoOne->useCache(3600)
    ->select('*')
    ->from('countries')
    ->toList(); // cache lasts 1 hour
$countries=$pdoOne->useCache(3600,'countries')
    ->select('*')
    ->from('countries')
    ->toList(); // cache lasts 1 hour or until the family of cache "countries" is purged.
$countries=$pdoOne->useCache(3600,'*') // indicates to use as family the table indicated in methods from() and joins().
    ->select('*')
    ->from('countries')
    ->toList(); // cache lasts 1 hour or until the family of cache "countries" is purged.
$countries=$pdoOne->useCache(3600,'*') // indicates to use as family the table indicated in methods from() and joins().
    ->select('*')
    ->from('countries')
    ->innerjoin('cities on countries.idcountry=cities.idcountry')
    ->toList(); // cache lasts 1 hour or until the family of cache "countries" or "cities" is purged.

Invalidating cache

We could invalidate cache using two methods, invalidating via the method invalidateCache() where we could invalidate via ID or via a family of cache.

The second method is to use cache and runs one of the DML operations (insert, update, and delete). It purges the family of cache automatically.

$pdoOne->invalidateGroup('','countries'); // we invalidate all the cache of the family countries
// or we could invalidate for insert,update or delete
$pdoOne->useCache(3600,'*')->from("countries")->set(['idcountry,name'],[101,'South Sudan'])->insert();

PSR-6

Why not follow the PSR-6 Standard?

  • First, it is not an official standard. An official standard is a standard written and followed by PHP. But it is not the main problem.
  • Second, the specification lacks of details whole it adds statistics that it could be useful but they are 100% operative.

For example, our initial exercise.

select * from countries
    inner join cities on countries.idcountry=cities.idcities

PSR-6 allows saving the cache as

$result=[..]; // storing the result of the query
$item=new CacheItemClass('KEYCACHE',result,500); // implements CacheItemInterface 
// configure item
CacheService::save(CacheItemInterface $item);

But what if we modify the table countries or the table cities. Maybe it doesn't affect the query but we don't know really.

We could invalidate the cache if we know the id

CacheService::deleteItem('KEYCACHE');

But how we related the "insert into countries" or "insert into cities" with KEYCACHE?

We could use the cache directly via PdoOne (persistence) or using the library CacheOne

Storing the cache

$result=[..]; // storing the result of the query
$cache->set(['countries','cities],"KEYCACHE",result,500);

Invalidating the cache

$cache->invalidate('KEYCACHE');
// or
$cache->invalidateGroup('countries'); // it invalidates all the cache stores in the group of family countries
// or 
$cache->invalidateGroup('cities'); // it invalidates all the cache stores in t

PSR-6

Why not follow the PSR-6 Standard?

  • First, it is not an official standard. An official standard is a standard written and followed by PHP. But it is not the main problem.
  • Second, the specification lacks of details, it adds statistics that it could be useful but it lacks other operatives commands.

For example, our initial exercise.

select * from countries
    inner join cities on countries.idcountry=cities.idcities

PSR-6 allows saving the cache as

$result=[..]; // storing the result of the query
$item=new CacheItemClass('KEYCACHE',result,500); // implements CacheItemInterface 
// configure item
CacheService::save(CacheItemInterface $item);

But what if we modify the table countries or the table cities. Maybe it doesn't affect the query but we don't know really.

We could invalidate the cache if we know the id

CacheService::deleteItem('KEYCACHE');

But how we related the "insert into countries" or "insert into cities" with KEYCACHE?

We could use the cache directly via PdoOne (persistence) or using the library CacheOne

Storing the cache

$result=[..]; // storing the result of the query
$cache->set(['countries','cities],"KEYCACHE",result,500);

Invalidating the cache

$cache->invalidate('KEYCACHE');
// or
$cache->invalidateGroup('countries'); // it invalidates all the cache stores in the group of family countries
// or 
$cache->invalidateGroup('cities'); // it invalidates all the cache stores in t

About Jorge Castro

Currently: Entrepreneur and Private Consultant
Civil Engineer in Informatics - USACH Chile.
Master in Business Administration (MBA) CEPADE Spain
Microsoft Certified Professional
Oracle Certified Associate
ScrumMaster Certified
Former developer
Former Project Manager

Related Posts