def read_sql_query(
    sql,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    meta=None,
    engine_kwargs=None,
    **kwargs,
):
    """
    Read SQL query into a DataFrame.

    If neither ``divisions`` nor ``npartitions`` is given, the memory footprint of the
    first few rows will be determined, and partitions of size ~256 MiB will
    be used.

    Parameters
    ----------
    sql : SQLAlchemy Selectable
        SQL query to be executed. TextClause is not supported
    con : str
        Full sqlalchemy URI for the database connection
    index_col : str
        Column which becomes the index, and defines the partitioning. Should
        be an indexed column in the SQL server, and any orderable type. If the
        type is number or time, then partition boundaries can be inferred from
        ``npartitions`` or ``bytes_per_chunk``; otherwise must supply explicit
        ``divisions``.
    divisions : sequence
        Values of the index column to split the table by. If given, this will
        override ``npartitions`` and ``bytes_per_chunk``. The divisions are the value
        boundaries of the index column used to define the partitions. For
        example, ``divisions=list('acegikmoqsuwz')`` could be used to partition
        a string column lexicographically into 12 partitions, with the implicit
        assumption that each partition contains similar numbers of records.
    npartitions : int
        Number of partitions, if ``divisions`` is not given. Will split the values
        of the index column linearly between ``limits``, if given, or the column
        max/min. The index column must be numeric or time for this to work.
    limits : 2-tuple or None
        Manually give upper and lower range of values for use with ``npartitions``;
        if None, first fetches max/min from the DB. Upper limit, if
        given, is inclusive.
    bytes_per_chunk : str or int
        If both ``divisions`` and ``npartitions`` are None, this is the target size of
        each partition, in bytes
    head_rows : int
        How many rows to load for inferring the data-types, and memory per row
    meta : empty DataFrame or None
        If provided, do not attempt to infer dtypes, but use these, coercing
        all chunks on load
    engine_kwargs : dict or None
        Specific db engine parameters for sqlalchemy
    kwargs : dict
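
The partitioning rules described in the docstring above can be illustrated with a small standalone sketch. It is not the library's implementation. The helper names `linear_divisions`, `partition_predicates` and `estimate_npartitions` are hypothetical, and the rounding rule in `estimate_npartitions` is an assumption about how a per-row memory estimate might be turned into a partition count. The sketch only shows how `npartitions` and `limits` could yield evenly spaced boundaries, how boundaries map to per-partition range filters with an inclusive upper limit, and how `bytes_per_chunk` could determine the number of partitions.

```python
def linear_divisions(lower, upper, npartitions):
    """Return npartitions + 1 evenly spaced boundaries from lower to upper."""
    if npartitions < 1:
        raise ValueError("npartitions must be at least 1")
    step = (upper - lower) / npartitions
    # Pin the final boundary to `upper` so floating-point drift cannot move it.
    return [lower + i * step for i in range(npartitions)] + [upper]


def partition_predicates(index_col, divisions):
    """Build one range-filter string per pair of adjacent boundaries.

    Every partition excludes its upper boundary except the last one,
    which includes it (the docstring states the upper limit is inclusive).
    """
    predicates = []
    last = len(divisions) - 2
    for i, (lo, hi) in enumerate(zip(divisions[:-1], divisions[1:])):
        upper_op = "<=" if i == last else "<"
        predicates.append(
            f"{index_col} >= {lo!r} AND {index_col} {upper_op} {hi!r}"
        )
    return predicates


def estimate_npartitions(bytes_per_row, row_count, bytes_per_chunk=256 * 2**20):
    """Assumed rule: total estimated bytes / target chunk size, at least 1."""
    return max(1, round(bytes_per_row * row_count / bytes_per_chunk))


# Numeric index: 4 partitions between explicit limits (0, 100).
numeric_divisions = linear_divisions(0, 100, 4)
numeric_filters = partition_predicates("id", numeric_divisions)

# String index: explicit divisions, as in the docstring example.
string_divisions = list("acegikmoqsuwz")
string_filters = partition_predicates("name", string_divisions)
```

Thirteen boundary values give twelve partitions, which matches the `list('acegikmoqsuwz')` example in the docstring. A string index column has no meaningful linear interpolation, which is why explicit `divisions` are required in that case.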