Commit c92b21b

Merge pull request #6 from emdgroup/feat/parallel-pagination
release
2 parents 9decbc8 + 9353775 commit c92b21b

8 files changed

+1007
-574
lines changed

README.md

Lines changed: 139 additions & 28 deletions
@@ -10,6 +10,7 @@ Features:
1010
* Works with TypeScript type guards natively
1111
* Ensures a minimum number of items when using a `FilterExpression`
1212
* Compatible with AWS SDK v2 and v3
13+
* Supports pagination over segmented [parallel scans](#parallel-scans)
1314

1415
Pagination in NoSQL stores such as DynamoDB can be challenging. This
1516
library provides a developer-friendly interface around the DynamoDB `Query` and `Scan` APIs.
@@ -25,6 +26,19 @@ to JSON-encode the `LastEvaluatedKey` attribute (or even the whole query command
2526
The token is sent to a client which can happily decode the token, look at the values for the
2627
partition and sort key and even modify the token, making the application vulnerable to NoSQL injections.
2728

29+
**How is the pagination token encrypted?**
30+
31+
The encryption key passed to the paginator is used to derive an encryption and a signing key using an HMAC.
32+
33+
The `LastEvaluatedKey` attribute is first flattened by length-encoding its datatypes and values. The
34+
encoded key is then encrypted with the encryption key using AES-256 in CBC mode with a randomly generated IV.
35+
36+
The additional authenticated data (AAD), the IV, the ciphertext and an int64 of the length of the AAD are
37+
concatenated to form the *message* to be signed.
38+
39+
The encrypted and signed pagination token is then returned by concatenating the IV, ciphertext and the
40+
first 16 bytes of the HMAC-SHA256 of the *message* using the signing key.
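
The steps above can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the library's implementation: the derivation labels (`'enc'`, `'sig'`), the 16-byte IV, the big-endian byte count used for the AAD length, and the function names `sealToken` and `openToken` are all assumptions, and the flattening of `LastEvaluatedKey` is left out.

```typescript
import { createCipheriv, createDecipheriv, createHmac, randomBytes, timingSafeEqual } from 'node:crypto';

// Derive separate encryption and signing keys from the single 32-byte key.
// The labels 'enc' and 'sig' are hypothetical; the README does not name them.
function deriveKeys(key: Buffer): { encKey: Buffer; sigKey: Buffer } {
  const encKey = createHmac('sha256', key).update('enc').digest();
  const sigKey = createHmac('sha256', key).update('sig').digest();
  return { encKey, sigKey };
}

// message = AAD || IV || ciphertext || int64(length of AAD); tag = first 16 bytes of the HMAC.
function sign(sigKey: Buffer, aad: Buffer, iv: Buffer, ciphertext: Buffer): Buffer {
  const aadLength = Buffer.alloc(8);
  // Assumption: length in bytes, big-endian. Writing the low 32 bits is enough for short AADs.
  aadLength.writeUInt32BE(aad.length, 4);
  return createHmac('sha256', sigKey)
    .update(Buffer.concat([aad, iv, ciphertext, aadLength]))
    .digest()
    .subarray(0, 16);
}

// token = IV || ciphertext || tag
function sealToken(key: Buffer, plaintext: Buffer, aad: Buffer): Buffer {
  const { encKey, sigKey } = deriveKeys(key);
  const iv = randomBytes(16);
  const cipher = createCipheriv('aes-256-cbc', encKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, ciphertext, sign(sigKey, aad, iv, ciphertext)]);
}

// Verify the tag before decrypting; reject anything that was modified or bound to another AAD.
function openToken(key: Buffer, token: Buffer, aad: Buffer): Buffer {
  if (token.length < 48) throw new Error('Invalid token');
  const { encKey, sigKey } = deriveKeys(key);
  const iv = token.subarray(0, 16);
  const ciphertext = token.subarray(16, token.length - 16);
  const tag = token.subarray(token.length - 16);
  if (!timingSafeEqual(tag, sign(sigKey, aad, iv, ciphertext))) throw new Error('Invalid token');
  const decipher = createDecipheriv('aes-256-cbc', encKey, iv);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

Because the AAD is part of the signed message but not part of the token, a token sealed for one context fails verification when opened with a different AAD.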
41+
2842
> "Dance like nobody is watching. Encrypt like everyone is."
2943
> -- Werner Vogels
3044
@@ -47,7 +61,7 @@ const paginateQuery = Paginator.createQuery({
4761
});
4862

4963
const paginator = paginateQuery({
50-
TableName: 'mytable',
64+
TableName: 'MyTable',
5165
KeyConditionExpression: 'PK = :pk',
5266
ExpressionAttributeValues: {
5367
':pk': 'U#ABC',
@@ -104,7 +118,7 @@ for await (const user of paginator.filter(isUser)) {
104118
## Paginator
105119

106120
The `Paginator` class is a factory for the [`PaginationResponse`](#PaginationResponse) object. This class
107-
is instantiated with the 32-byte encryption key and the DynamoDB document client (versions
121+
is instantiated with a 32-byte key and the DynamoDB document client (versions
108122
2 and 3 of the AWS SDK are supported).
109123

110124
```typescript
@@ -123,6 +137,40 @@ const paginateScan = Paginator.createScan({
123137
});
124138
```
125139

140+
### Parallel Scans
141+
142+
This library also supports pagination over segmented parallel scans. This is useful when you have a large
143+
table and want to parallelize the scan operation to reduce the time it takes to scan the whole table.
144+
145+
To create a paginator over a segmented scan operation, use `createParallelScan`.
146+
147+
```typescript
148+
const paginateParallelScan = Paginator.createParallelScan({
149+
key: () => Promise.resolve(crypto.randomBytes(32)),
150+
client: documentClient,
151+
});
152+
```
153+
154+
Then, create a paginator and pass the `segments` parameter.
155+
156+
```ts
157+
const paginator = paginateParallelScan({
158+
TableName: 'MyTable',
159+
Limit: 250,
160+
}, { segments: 10 });
161+
162+
await paginator.all();
163+
```
164+
165+
The scan will be executed in parallel over 10 segments. The paginator will return the items in the order
166+
they are returned by DynamoDB, which might deliver items from different segments out of order. Refer to the
167+
following waterfall diagram for an example. The parallel scan was executed over a high-latency connection
168+
to better illustrate the variability in the requests and responses. Even though the `Limit` is set to 250,
169+
DynamoDB will occasionally return fewer than 250 items per segment. The paginator will continue to request
170+
items until all segments have been exhausted.
171+
172+
![parallel scan](img/waterfall.svg)
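
For background, DynamoDB splits a scan across workers through the `Segment` and `TotalSegments` parameters of the `Scan` API. The sketch below shows how one scan input could be fanned out into per-segment inputs. The helper `segmentInputs` and the trimmed-down `ScanInput` type are hypothetical and only illustrate the idea; how the library builds its requests internally is not shown in this README.

```typescript
// Trimmed-down stand-in for the SDK's ScanCommandInput (assumption: only these fields matter here).
interface ScanInput {
  TableName: string;
  Limit?: number;
  Segment?: number;
  TotalSegments?: number;
}

// Produce one scan input per segment. Segments are zero-based, and every
// request carries the same TotalSegments value.
function segmentInputs(scan: ScanInput, segments: number): ScanInput[] {
  if (!Number.isInteger(segments) || segments < 1) {
    throw new RangeError('segments must be a positive integer');
  }
  return Array.from({ length: segments }, (_, segment) => ({
    ...scan,
    Segment: segment,
    TotalSegments: segments,
  }));
}

const inputs = segmentInputs({ TableName: 'MyTable', Limit: 250 }, 10);
console.log(inputs.length); // one input per segment
```

Each segment keeps its own `LastEvaluatedKey`, which is why responses from different segments can interleave.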
173+
126174
## Constructors
127175

128176
### constructor
@@ -139,6 +187,45 @@ Use the static factory function [`create()`](#create) instead of the constructor
139187

140188
## Methods
141189

190+
### createParallelScan
191+
192+
`Static` **createParallelScan**(`args`): <T\>(`scan`: `ScanCommandInput`, `opts`: [`PaginateQueryOptions`](#PaginateQueryOptions)<`T`\> & { `segments`: `number` }) => `ParallelPaginationResponse`<`T`\>
193+
194+
Returns a function that accepts a DynamoDB Scan command and returns an instance of `ParallelPaginationResponse`.
195+
196+
#### Parameters
197+
198+
| Name | Type |
199+
| :------ | :------ |
200+
| `args` | [`PaginatorOptions`](#PaginatorOptions) |
201+
202+
#### Returns
203+
204+
`fn`
205+
206+
▸ <`T`\>(`scan`, `opts`): `ParallelPaginationResponse`<`T`\>
207+
208+
Returns a function that accepts a DynamoDB Scan command and returns an instance of `ParallelPaginationResponse`.
209+
210+
##### Type parameters
211+
212+
| Name | Type |
213+
| :------ | :------ |
214+
| `T` | extends `AttributeMap` |
215+
216+
##### Parameters
217+
218+
| Name | Type |
219+
| :------ | :------ |
220+
| `scan` | `ScanCommandInput` |
221+
| `opts` | [`PaginateQueryOptions`](#PaginateQueryOptions)<`T`\> & { `segments`: `number` } |
222+
223+
##### Returns
224+
225+
`ParallelPaginationResponse`<`T`\>
226+
227+
___
228+
142229
### createQuery
143230

144231
`Static` **createQuery**(`args`): <T\>(`query`: `QueryCommandInput`, `opts?`: [`PaginateQueryOptions`](#PaginateQueryOptions)<`T`\>) => [`PaginationResponse`](#PaginationResponse)<`T`\>
@@ -235,7 +322,7 @@ ___
235322
Object that resolves an index name to the partition and sort key for that index.
236323
Also accepts a function that builds the names based on the index name.
237324

238-
Defaults to ```(index) => [`${index}PK`, `${index}PK`]```.
325+
Defaults to ```(index) => [`${index}PK`, `${index}SK`]```.
239326

240327
___
241328

@@ -246,7 +333,7 @@ ___
246333
A 32-byte encryption key (e.g. `crypto.randomBytes(32)`). The `key` parameter also
247334
accepts a Promise that resolves to a key or a function that resolves to a Promise of a key.
248335

249-
If a function is passed, that function is only called once. The function is called concurrently
336+
If a function is passed, that function is called lazily and only once. The function is called concurrently
250337
with the first query request to DynamoDB to reduce the overall latency for the first query. The
251338
key is cached and the function is not called again.
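
One way to get this "called lazily and only once" behavior is to cache the promise rather than the resolved key, so that concurrent callers share a single in-flight call. The following is a minimal sketch under that assumption; the names `KeyInput` and `lazyKey` are hypothetical and not part of the library's API.

```typescript
// The three accepted shapes of the `key` option, as described above.
type KeyInput = Uint8Array | Promise<Uint8Array> | (() => Promise<Uint8Array>);

// Wrap the key option in a getter that resolves it on first use and
// returns the same cached promise on every later call.
function lazyKey(key: KeyInput): () => Promise<Uint8Array> {
  let cached: Promise<Uint8Array> | undefined;
  return () => {
    if (cached === undefined) {
      // A key function is invoked here, on first use, and never again.
      cached = typeof key === 'function' ? key() : Promise.resolve(key);
    }
    return cached;
  };
}
```

Since the getter returns a promise immediately, the first DynamoDB request can be started before the key has resolved and awaited together with it.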
252339

@@ -275,37 +362,25 @@ items after the end of the query is reached or the provided [`limit`](#limit) pa
275362

276363
## Properties
277364

278-
### consumedCapacity
279-
280-
**consumedCapacity**: `number`
281-
282-
Total consumed capacity for query
283-
284-
___
285-
286365
### count
287366

288367
**count**: `number`
289368

290369
Number of items yielded
291370

292-
___
293-
294-
### requestCount
371+
## Accessors
295372

296-
**requestCount**: `number`
373+
### consumedCapacity
297374

298-
Number of requests made to DynamoDB
375+
`get` **consumedCapacity**(): `number`
299376

300-
___
377+
Total consumed capacity for query
301378

302-
### scannedCount
379+
#### Returns
303380

304-
**scannedCount**: `number`
381+
`number`
305382

306-
Number of items scanned by DynamoDB
307-
308-
## Accessors
383+
___
309384

310385
### finished
311386

@@ -340,6 +415,30 @@ or index that you are querying. The token length is at least 42 characters.
340415

341416
`undefined` \| `string`
342417

418+
___
419+
420+
### requestCount
421+
422+
`get` **requestCount**(): `number`
423+
424+
Number of requests made to DynamoDB
425+
426+
#### Returns
427+
428+
`number`
429+
430+
___
431+
432+
### scannedCount
433+
434+
`get` **scannedCount**(): `number`
435+
436+
Number of items scanned by DynamoDB
437+
438+
#### Returns
439+
440+
`number`
441+
343442
## Methods
344443

345444
### [asyncIterator]
@@ -398,28 +497,40 @@ ___
398497

399498
### from
400499

401-
**from**(`nextToken`): [`PaginationResponse`](#PaginationResponse)<`T`\>
500+
**from**<`L`\>(`nextToken`): `L`
402501

403502
Return results starting from `nextToken`
404503

504+
#### Type parameters
505+
506+
| Name | Type |
507+
| :------ | :------ |
508+
| `L` | extends [`PaginationResponse`](#PaginationResponse)<`T`, `L`\> |
509+
405510
#### Parameters
406511

407512
| Name | Type |
408513
| :------ | :------ |
409-
| `nextToken` | `string` |
514+
| `nextToken` | `undefined` \| `string` |
410515

411516
#### Returns
412517

413-
[`PaginationResponse`](#PaginationResponse)<`T`\>
518+
`L`
414519

415520
___
416521

417522
### limit
418523

419-
**limit**(`limit`): [`PaginationResponse`](#PaginationResponse)<`T`\>
524+
**limit**<`L`\>(`limit`): `L`
420525

421526
Limit the number of results to `limit`. Will return at least `limit` results even when using FilterExpressions.
422527

528+
#### Type parameters
529+
530+
| Name | Type |
531+
| :------ | :------ |
532+
| `L` | extends [`PaginationResponse`](#PaginationResponse)<`T`, `L`\> |
533+
423534
#### Parameters
424535

425536
| Name | Type |
@@ -428,7 +539,7 @@ Limit the number of results to `limit`. Will return at least `limit` results eve
428539

429540
#### Returns
430541

431-
[`PaginationResponse`](#PaginationResponse)<`T`\>
542+
`L`
432543

433544
___
434545

img/waterfall.svg

Lines changed: 1 addition & 0 deletions

package.json

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
{
22
"name": "@emdgroup/dynamodb-paginator",
3-
"version": "2.1.4",
3+
"version": "2.2.0",
44
"main": "dist/cjs/index.js",
55
"exports": {
66
"import": "./dist/esm/index.js",
@@ -14,17 +14,20 @@
1414
"keywords": [
1515
"aws",
1616
"dynamodb",
17-
"pagination"
17+
"pagination",
18+
"scroll",
19+
"scan",
20+
"query"
1821
],
1922
"files": [
2023
"dist"
2124
],
2225
"dependencies": {
23-
"@aws-sdk/client-dynamodb": "^3.0.0",
24-
"@aws-sdk/lib-dynamodb": "^3.0.0",
25-
"@aws-sdk/smithy-client": "^3.0.0",
26-
"@aws-sdk/types": "^3.0.0",
27-
"@aws-sdk/util-dynamodb": "^3.0.0"
26+
"@aws-sdk/client-dynamodb": "^3.171.0",
27+
"@aws-sdk/lib-dynamodb": "^3.171.0",
28+
"@aws-sdk/smithy-client": "^3.171.0",
29+
"@aws-sdk/types": "^3.171.0",
30+
"@aws-sdk/util-dynamodb": "^3.171.0"
2831
},
2932
"devDependencies": {
3033
"@tsconfig/node14": "^1.0.0",
