How to Paginate Large API Datasets: Limit & Offset
Understand how to handle large datasets efficiently using `limit`, `offset`, column filtering, and sorting modifiers.
When working with high-resolution financial market data, response payloads can become exceptionally large. The algoseek Datasets API provides a robust set of query parameters designed to help you efficiently paginate through millions of rows, optimize network transfer sizes, and shape the data to fit your exact analytical models.
This guide details the standard query parameters available across the /data/ endpoints and demonstrates how to implement them effectively.
1. Pagination (limit and offset)
Data endpoints return results in subsets (pages) to ensure system stability and predictable response times. You control pagination using the limit and offset parameters.
- limit: Defines the maximum number of records to return in a single request.
  - Default: 1000
  - Maximum: 10000
- offset: Specifies the number of records to skip before beginning to return data.
  - Default: 0
Handling the Pagination Response:
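When returning JSON, the API includes a pagination metadata block alongside your data array. Use the next_offset field to determine whether you need to make subsequent requests. In the example responses in this guide, next_offset is either null or an integer giving the offset at which the next page begins.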
Example Request:
curl --get \
https://api.devalgoseek.com/api/v1/data/us-equity/eq-daily-ohlc \
--data 'Ticker=AAPL' \
--data 'limit=5000' \
--data 'offset=0' \
-H "X-API-KEY: YOUR_API_KEY"
Example Response Block:
{
"data": [
{
"TradeDate": "2007-01-03",
"Ticker": "AAPL",
"ASID": 1010000000001033,
"OpenPrice": 86.28,
"HighPrice": 86.58,
"LowPrice": 81.9,
"ClosePrice": 83.76,
"MarketHoursVolume": 43432630,
"MarketHoursFinraVolume": 10696551,
"DailyVolume": 45105870,
"DailyFinraVolume": 11540291,
"MarketHoursVWAP": 84.848,
"DailyVWAP": 84.8501
},
{
"TradeDate": "2007-01-04",
"Ticker": "AAPL",
"ASID": 1010000000001033,
"OpenPrice": 84.17,
"HighPrice": 85.95,
"LowPrice": 83.82,
"ClosePrice": 85.66,
"MarketHoursVolume": 29812464,
"MarketHoursFinraVolume": 7804137,
"DailyVolume": 30928848,
"DailyFinraVolume": 8317690,
"MarketHoursVWAP": 85.1213,
"DailyVWAP": 85.1256
},
{
"TradeDate": "2007-01-05",
"Ticker": "AAPL",
"ASID": 1010000000001033,
"OpenPrice": 85.84,
"HighPrice": 86.2,
"LowPrice": 84.4,
"ClosePrice": 85.05,
"MarketHoursVolume": 29443139,
"MarketHoursFinraVolume": 6892481,
"DailyVolume": 30202106,
"DailyFinraVolume": 7133007,
"MarketHoursVWAP": 85.1939,
"DailyVWAP": 85.1969
},
{
"TradeDate": "2007-01-08",
"Ticker": "AAPL",
"ASID": 1010000000001033,
"OpenPrice": 85.98,
"HighPrice": 86.53,
"LowPrice": 85.28,
"ClosePrice": 85.47,
"MarketHoursVolume": 28103154,
"MarketHoursFinraVolume": 7747266,
"DailyVolume": 28701137,
"DailyFinraVolume": 7896292,
"MarketHoursVWAP": 85.9176,
"DailyVWAP": 85.9128
},
...
],
"pagination": {
"offset": 0,
"limit": 5000,
"next_offset": null
}
}
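To fetch the next page, execute the same query with offset advanced by limit (for example, offset=5000 after a first request with limit=5000 and offset=0), or use the next_offset value when the response provides one. In the response above next_offset is null, which appears to indicate that no further pages remain.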
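The loop described above can be sketched in Python with the standard library only. This is a minimal illustration, not an official client: the helper names fetch_page and iter_records are hypothetical, and the code assumes that a null next_offset means the last page has been reached.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

BASE_URL = "https://api.devalgoseek.com/api/v1/data/us-equity/eq-daily-ohlc"


def fetch_page(params: dict, api_key: str = "YOUR_API_KEY") -> dict:
    """Perform one GET request and decode the JSON body."""
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers={"X-API-KEY": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_records(params: dict, limit: int = 5000,
                 fetch: Callable[[dict], dict] = fetch_page) -> Iterator[dict]:
    """Yield every record, requesting pages until next_offset is null."""
    offset: Optional[int] = 0
    while offset is not None:
        page = fetch({**params, "limit": limit, "offset": offset})
        yield from page["data"]
        next_offset = page["pagination"]["next_offset"]
        # Assumption: null (None) means there are no further pages.
        if next_offset is not None and next_offset <= offset:
            # Guard against a loop if the offset ever fails to advance.
            raise ValueError("next_offset did not advance")
        offset = next_offset
```

Calling list(iter_records({"Ticker": "AAPL"})) would collect all rows for the ticker. The fetch argument is injectable so that the loop can be exercised against a stub without network access.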
2. Column Filtering (columns)
By default, the API returns all available fields for a given dataset. Financial datasets often contain dozens of columns per record (e.g., US Equities Trade and Quote Daily Bar contains 65 data points). If your model needs only a few fields, use the columns query parameter to request them; this reduces payload size, transfer time, and parsing time.
- Syntax: A comma-separated list of exact column names (case-sensitive).
Example Request:
curl --get \
https://api.devalgoseek.com/api/v1/data/us-equity/eq-taq/2023-03-01/AAPL \
--data 'columns=TradeDate,EventDateTime,Ticker,Price,Quantity' \
-H "X-API-KEY: YOUR_API_KEY"
Example Response:
{
"data": [
{
"TradeDate": "2023-03-01",
"EventDateTime": "2023-03-01 03:59:00.051594605",
"Ticker": "AAPL",
"Price": 0,
"Quantity": 0
},
{
"TradeDate": "2023-03-01",
"EventDateTime": "2023-03-01 03:59:00.051594605",
"Ticker": "AAPL",
"Price": 0,
"Quantity": 0
},
{
"TradeDate": "2023-03-01",
"EventDateTime": "2023-03-01 03:59:00.051594605",
"Ticker": "AAPL",
"Price": 0,
"Quantity": 0
},
{
"TradeDate": "2023-03-01",
"EventDateTime": "2023-03-01 03:59:00.051594605",
"Ticker": "AAPL",
"Price": 0,
"Quantity": 0
},
{
"TradeDate": "2023-03-01",
"EventDateTime": "2023-03-01 03:59:00.204793170",
"Ticker": "AAPL",
"Price": 0,
"Quantity": 0
},
{
"TradeDate": "2023-03-01",
"EventDateTime": "2023-03-01 03:59:00.204793170",
"Ticker": "AAPL",
"Price": 0,
"Quantity": 0
},
...
],
"pagination": {
"offset": 0,
"limit": 10000,
"next_offset": 10000
}
}
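The same request URL can be assembled programmatically. The sketch below uses only the Python standard library; build_url is a hypothetical helper name, and keeping the commas unescaped is a readability choice rather than a documented requirement of the API.

```python
from urllib.parse import urlencode


def build_url(base_url: str, columns: list, **params) -> str:
    """Append query parameters, joining column names with commas."""
    query = dict(params)
    if columns:
        # Column names are case-sensitive, so they are passed through unchanged.
        query["columns"] = ",".join(columns)
    # safe="," leaves the comma separators readable in the final URL.
    return base_url + "?" + urlencode(query, safe=",")


url = build_url(
    "https://api.devalgoseek.com/api/v1/data/us-equity/eq-taq/2023-03-01/AAPL",
    ["TradeDate", "EventDateTime", "Ticker", "Price", "Quantity"],
)
```

Passing an empty column list omits the parameter entirely, which leaves the API to return every column.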
3. Sorting (sort)
The sort parameter allows you to define the ordering of the returned records. Depending on the dataset, multiple sorting fields may be supported.
- Ascending Order: Prefix the column name with + (or provide the column name with no prefix).
- Descending Order: Prefix the column name with -.
- Multiple Columns: Provide a comma-separated list. The sort is applied in the order the fields are listed.
Example Request (Sort by descending TradeDate, then ascending ClosePrice):
curl --get \
'https://api.devalgoseek.com/api/v1/data/us-equity/eq-daily-ohlc' \
--data 'Ticker=AAPL' \
--data 'sort=-TradeDate,+ClosePrice' \
--data 'sort=-TradeDate,+ClosePrice' \
-H "X-API-KEY: YOUR_API_KEY"
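When the sort value is built in code, the prefix rules above reduce to a small helper. The following Python sketch uses a hypothetical function name, build_sort, and deliberately omits the optional + prefix for ascending columns, because a literal + in a query string is commonly decoded as a space on the server side.

```python
from urllib.parse import urlencode


def build_sort(fields) -> str:
    """Turn (column, descending) pairs into a sort parameter value."""
    # Descending columns get a "-" prefix; ascending columns get no prefix.
    return ",".join(("-" + column) if descending else column
                    for column, descending in fields)


sort_value = build_sort([("TradeDate", True), ("ClosePrice", False)])
query = urlencode({"Ticker": "AAPL", "sort": sort_value}, safe=",")
```

The order of the pairs is preserved, so the first pair is the primary sort key and later pairs break ties.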
4. Response Formatting (response_format)
While JSON is the default format, it is not always the most efficient choice for bulk ingestion into pandas DataFrames, relational databases, or quantitative frameworks. The API supports direct CSV and compressed CSV downloads.
- Options:
  - json (Default)
  - csv (Returns text/csv data)
  - csv_gzip (Returns application/gzip containing data in CSV format)
Example Request (Download compressed CSV):
curl --get \
'https://api.devalgoseek.com/api/v1/data/us-equity/eq-taq/2023-08-02/AAPL' \
--data 'response_format=csv_gzip' \
-H "X-API-KEY: YOUR_API_KEY" \
--output 'AAPL_20230802_TAQ.csv.gz'
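A csv_gzip payload can be decoded with Python's standard gzip and csv modules. This is a minimal sketch under two assumptions that the guide does not state: that the CSV is UTF-8 encoded and that its first row is a header. The function name read_csv_gzip and the sample row are hypothetical.

```python
import csv
import gzip
import io


def read_csv_gzip(payload: bytes) -> list:
    """Decompress a gzip payload and parse it as CSV with a header row."""
    # Assumption: the CSV text is UTF-8 and starts with column names.
    text = gzip.decompress(payload).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample standing in for a downloaded response body.
sample = gzip.compress(b"TradeDate,Ticker,Price\n2023-08-02,AAPL,0\n")
rows = read_csv_gzip(sample)
```

Every value comes back as a string, so numeric columns need an explicit conversion. For a large file already saved to disk, streaming it with gzip.open(path, "rt", newline="") avoids holding the whole payload in memory.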