public abstract class BatchingStrategy&lt;G,K,T&gt; extends Object

Type Parameters:
G - Type of a Group
K - Type of a Key
T - Type of a Value
BatchingStrategy helps build "batching clients" in ParSeq. A "client" is an object that, given a K key,
provides a task that returns a T value. "Batching" means that it can group keys together and resolve their values
in batches. The benefit of this approach is that batching happens transparently in the background and user code
does not have to implement the batching logic itself.
An example of a batching client might be a ParSeq client for a key-value store that provides a batch get operation. For the sake of simplicity, the example uses a dummy, synchronous key-value store interface:
interface KVStore {
  String get(Long key);
  Map&lt;Long, String&gt; batchGet(Collection&lt;Long&gt; keys);
}
We can then implement a BatchingStrategy in the following way:
public static class BatchingKVStoreClient extends BatchingStrategy&lt;Integer, Long, String&gt; {
  private final KVStore _store;

  public BatchingKVStoreClient(KVStore store) {
    _store = store;
  }

  @Override
  public void executeBatch(Integer group, Batch&lt;Long, String&gt; batch) {
    Map&lt;Long, String&gt; batchResult = _store.batchGet(batch.keys());
    batch.foreach((key, promise) -&gt; promise.done(batchResult.get(key)));
  }

  @Override
  public Integer classify(Long entry) {
    return 0;
  }
}

In the above example there is an assumption that all keys can be grouped together. This is why the
classify() method trivially returns a constant 0. In practice classify() returns a group for a key. Keys that have
the same group will be batched together.
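In a real client, classify() typically derives the group from the key itself, for example a shard or partition id. The following is a minimal sketch in plain Java (the 4-partition scheme and all names here are illustrative assumptions, not part of the ParSeq API) showing how keys sharing a group end up in the same batch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClassifySketch {
    // Hypothetical classify(): route each key to one of 4 partitions.
    public static Integer classify(Long key) {
        return (int) (key % 4);
    }

    // Group keys the way a BatchingStrategy would: same group => same batch.
    public static Map<Integer, List<Long>> groupKeys(List<Long> keys) {
        Map<Integer, List<Long>> batches = new LinkedHashMap<>();
        for (Long key : keys) {
            batches.computeIfAbsent(classify(key), g -> new ArrayList<>()).add(key);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Keys 1, 5, 9 share group 1; keys 2, 6 share group 2.
        System.out.println(groupKeys(List.of(1L, 2L, 5L, 6L, 9L)));
    }
}
```
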
The interaction between ParSeq and BatchingStrategy is the following:
- batchable(String desc, K key) is invoked to create a Task instance
- the plan is started by Engine.run()
- when the task returned by batchable(String desc, K key) is started, the key K is remembered by the BatchingStrategy
- the BatchingStrategy is then invoked to run the batchable operations:
  - every K key is classified using the classify(K key) method
  - keys are grouped into batches by the G group returned by the previous step
  - executeBatch(G group, Batch&lt;K, T&gt; batch) is invoked for every batch
- executeBatch(G group, Batch&lt;K, T&gt; batch) invocations are executed in the context of their own Task instances, with descriptions given by getBatchName(G group, Batch&lt;K, T&gt; batch)
An implementation of BatchingStrategy has to be fast because it is executed sequentially with respect to the tasks belonging
to the plan: no other task will be executed until the BatchingStrategy completes. Typically classify(K key)
is a fast, synchronous operation, while executeBatch(G group, Batch&lt;K, T&gt; batch) returns quickly and completes its
promises asynchronously.
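This division of labor can be sketched without ParSeq by substituting CompletableFuture for SettablePromise: executeBatch() only registers the promises and hands the work off to another executor before returning, while the resolution happens later on that executor's thread. All class and method names below are illustrative assumptions, not ParSeq API:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncBatchSketch {
    // Stand-in for a Batch: each key paired with a promise to resolve.
    public static final Map<Long, CompletableFuture<String>> batch = new ConcurrentHashMap<>();

    // executeBatch analogue: returns quickly, resolves promises on another thread.
    public static void executeBatch(List<Long> keys, ExecutorService executor) {
        keys.forEach(k -> batch.put(k, new CompletableFuture<>()));
        executor.submit(() -> {
            // Simulated backend batch call; a real client would issue one RPC here.
            for (Long key : keys) {
                batch.get(key).complete("value-" + key);
            }
        });
        // The method returns before the promises are completed.
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executeBatch(List.of(1L, 2L), executor);
        System.out.println(batch.get(1L).join()); // blocks until resolved
        executor.shutdown();
    }
}
```
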
Direct Known Subclasses:
SimpleBatchingStrategy, TaskBatchingStrategy

| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_MAX_BATCH_SIZE |
| Constructor and Description |
|---|
| BatchingStrategy() |
| Modifier and Type | Method and Description |
|---|---|
| com.linkedin.parseq.Task&lt;T&gt; | batchable(K key) - This method returns a Task that returns the value for a single key, allowing this strategy to batch operations. |
| com.linkedin.parseq.Task&lt;T&gt; | batchable(String desc, K key) - This method returns a Task that returns the value for a single key, allowing this strategy to batch operations. |
| abstract G | classify(K key) - Classify the K key and by doing so assign it to a G group. |
| abstract void | executeBatch(G group, Batch&lt;K,T&gt; batch) - This method will be called for every Batch. |
| protected void | executeBatchWithContext(G group, Batch&lt;K,T&gt; batch, com.linkedin.parseq.Context ctx) |
| BatchAggregationTimeMetric | getBatchAggregationTimeMetric() |
| String | getBatchName(G group, Batch&lt;K,T&gt; batch) - Overriding this method allows providing a custom name for a batch. |
| BatchSizeMetric | getBatchSizeMetric() |
| int | keySize(G group, K key) - Overriding this method allows specifying the size of a key for a given group. |
| int | maxBatchSizeForGroup(G group) - Overriding this method allows specifying the maximum batch size for a given group. |
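Together, maxBatchSizeForGroup and keySize bound how large a single batch may grow: once the accumulated key sizes for a group would exceed the group's maximum, a new batch is started. The following plain-Java sketch illustrates that splitting logic; the limit of 3, the unit key size, and the harness itself are assumptions for illustration, not ParSeq's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitSketch {
    public static int maxBatchSizeForGroup(int group) { return 3; } // assumed limit
    public static int keySize(int group, long key) { return 1; }    // each key counts as 1

    // Split keys for one group into batches no larger than the group's maximum.
    public static List<List<Long>> split(int group, List<Long> keys) {
        List<List<Long>> batches = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        int size = 0;
        for (Long key : keys) {
            if (size + keySize(group, key) > maxBatchSizeForGroup(group)) {
                batches.add(current);          // flush the full batch
                current = new ArrayList<>();
                size = 0;
            }
            current.add(key);
            size += keySize(group, key);
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // 5 unit-size keys with a maximum batch size of 3 yield two batches.
        System.out.println(split(0, List.of(1L, 2L, 3L, 4L, 5L)));
    }
}
```
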
public static final int DEFAULT_MAX_BATCH_SIZE
public com.linkedin.parseq.Task&lt;T&gt; batchable(String desc, K key)

Parameters:
desc - description of the task
key - key

public com.linkedin.parseq.Task&lt;T&gt; batchable(K key)

Parameters:
key - key

public BatchSizeMetric getBatchSizeMetric()
public BatchAggregationTimeMetric getBatchAggregationTimeMetric()
public abstract void executeBatch(G group, Batch&lt;K,T&gt; batch)

This method will be called for every Batch. The implementation of this method must make sure that every SettablePromise contained in the Batch
will eventually be resolved - typically asynchronously. Failing to eventually resolve any
of the promises may lead to a plan that never completes, i.e. appears to hang, and may lead to
a memory leak.

Parameters:
group - group that represents the batch
batch - batch containing the collection of SettablePromise instances that eventually need to be resolved - typically asynchronously

protected void executeBatchWithContext(G group, Batch&lt;K,T&gt; batch, com.linkedin.parseq.Context ctx)
public abstract G classify(K key)

Classify the K key and by doing so assign it to a G group.
If two keys are classified by the same group then they will belong to the same Batch.
This method needs to be thread safe.

Parameters:
key - key to be classified

public int maxBatchSizeForGroup(G group)

Parameters:
group - group for which the maximum batch size needs to be decided

public int keySize(G group, K key)

Parameters:
group - group

See Also:
maxBatchSizeForGroup(Object)

public String getBatchName(G group, Batch&lt;K,T&gt; batch)

Parameters:
group - group to be described
batch - batch to be described

Copyright © 2018. All rights reserved.