For example, when BigTable receives a request to load data from a disk, it needs to decide whether to cache or not cache a particular block. Obviously, if the client is doing a sequential scan of the data, then he might never re-access that block, but if he is doing random access of the data, he will likely access the same block in the future. For example, jobs like MapReduce are very likely to do a sequential scan. We can not hard-code this type of information into the heuristics, but a learned system might actually take this information into account(e.g. the job name and the user).