Function read_sql_query

dask/dataframe/io/sql.py:17–192

Read SQL query into a DataFrame. If neither ``divisions`` nor ``npartitions`` is given, the memory footprint of the first few rows will be determined, and partitions of size ~256MB will be used.
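The sizing rule in the summary can be illustrated with a small stdlib-only sketch. It assumes the partition count is the estimated total size (row count times bytes per row) divided by the target chunk size, rounded, with a floor of one partition. The helper name `estimate_npartitions` and its arguments are hypothetical and are not part of dask; the default of `256 * 2**20` is simply what the string "256 MiB" denotes in bytes.

```python
def estimate_npartitions(
    row_count: int,
    bytes_per_row: float,
    bytes_per_chunk: int = 256 * 2**20,  # "256 MiB" expressed in bytes
) -> int:
    """Hypothetical helper: guess a partition count from a per-row size estimate.

    `bytes_per_row` stands in for the memory footprint measured on the
    first few rows, divided by the number of rows sampled.
    """
    if bytes_per_chunk <= 0:
        raise ValueError("bytes_per_chunk must be positive")
    total_bytes = row_count * bytes_per_row
    # Always return at least one partition, even for tiny results.
    return max(1, round(total_bytes / bytes_per_chunk))


if __name__ == "__main__":
    # One million rows at ~512 bytes each is a bit under two chunks.
    print(estimate_npartitions(1_000_000, 512))  # 2
    # A tiny result still gets a single partition.
    print(estimate_npartitions(100, 512))  # 1
```

Because the estimate comes from only a few rows, it can be far off for tables whose row size varies a lot; passing ``npartitions`` or ``divisions`` explicitly avoids relying on it.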

(
    sql,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    meta=None,
    engine_kwargs=None,
    **kwargs,
)

Source from the content-addressed store, hash-verified

def read_sql_query(
    sql,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    meta=None,
    engine_kwargs=None,
    **kwargs,
):
    """
    Read SQL query into a DataFrame.

    If neither ``divisions`` nor ``npartitions`` is given, the memory footprint of the
    first few rows will be determined, and partitions of size ~256MB will
    be used.

    Parameters
    ----------
    sql : SQLAlchemy Selectable
        SQL query to be executed. TextClause is not supported
    con : str
        Full sqlalchemy URI for the database connection
    index_col : str
        Column which becomes the index, and defines the partitioning. Should
        be an indexed column in the SQL server, and any orderable type. If the
        type is number or time, then partition boundaries can be inferred from
        ``npartitions`` or ``bytes_per_chunk``; otherwise must supply explicit
        ``divisions``.
    divisions: sequence
        Values of the index column to split the table by. If given, this will
        override ``npartitions`` and ``bytes_per_chunk``. The divisions are the value
        boundaries of the index column used to define the partitions. For
        example, ``divisions=list('acegikmoqsuwz')`` could be used to partition
        a string column lexicographically into 12 partitions, with the implicit
        assumption that each partition contains similar numbers of records.
    npartitions : int
        Number of partitions, if ``divisions`` is not given. Will split the values
        of the index column linearly between ``limits``, if given, or the column
        max/min. The index column must be numeric or time for this to work
    limits: 2-tuple or None
        Manually give upper and lower range of values for use with ``npartitions``;
        if None, first fetches max/min from the DB. Upper limit, if
        given, is inclusive.
    bytes_per_chunk : str or int
        If both ``divisions`` and ``npartitions`` are None, this is the target size of
        each partition, in bytes
    head_rows : int
        How many rows to load for inferring the data-types, and memory per row
    meta : empty DataFrame or None
        If provided, do not attempt to infer dtypes, but use these, coercing
        all chunks on load
    engine_kwargs : dict or None
        Specific db engine parameters for sqlalchemy
    kwargs : dict
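The ``npartitions``, ``limits`` and ``divisions`` parameters described in the docstring can be illustrated with a stdlib-only sketch. It is not dask's implementation: `linear_divisions` and `partition_predicates` are hypothetical helpers, and the predicates are rendered as plain strings only for display (real code should build SQLAlchemy expressions or use bound parameters). The sketch assumes that every partition excludes its upper boundary except the last one, which is consistent with the docstring's note that the upper limit is inclusive.

```python
from typing import List, Sequence


def linear_divisions(lower: float, upper: float, npartitions: int) -> List[float]:
    """Hypothetical helper: split [lower, upper] into equal-width ranges."""
    if npartitions < 1:
        raise ValueError("npartitions must be at least 1")
    step = (upper - lower) / npartitions
    # Append `upper` itself so floating-point error cannot shift the last boundary.
    return [lower + i * step for i in range(npartitions)] + [upper]


def partition_predicates(index_col: str, divisions: Sequence) -> List[str]:
    """Hypothetical helper: one range predicate per pair of adjacent divisions.

    Assumption: only the final partition includes its upper boundary,
    so no row is read twice and the overall upper limit is inclusive.
    """
    if len(divisions) < 2:
        raise ValueError("need at least two division values")
    lowers, uppers = divisions[:-1], divisions[1:]
    last = len(lowers) - 1
    predicates = []
    for i, (low, high) in enumerate(zip(lowers, uppers)):
        upper_op = "<=" if i == last else "<"
        predicates.append(
            f"{index_col} >= {low!r} AND {index_col} {upper_op} {high!r}"
        )
    return predicates


if __name__ == "__main__":
    # Numeric index: boundaries inferred from limits and npartitions.
    for predicate in partition_predicates("id", linear_divisions(0, 100, 4)):
        print(predicate)

    # String index: explicit divisions, as in the docstring's example.
    by_letter = partition_predicates("name", list("acegikmoqsuwz"))
    print(len(by_letter))  # 12
```

The string example shows why explicit ``divisions`` are needed for non-numeric index columns: there is no way to interpolate linearly between ``'a'`` and ``'z'``, so the caller has to choose boundaries that spread the rows roughly evenly.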

Callers 6

test_query · Function · 0.90

test_query_index_from_query · Function · 0.90

test_query_with_meta · Function · 0.90

test_read_sql_query_string_raises_error · Function · 0.90

read_sql_table · Function · 0.85

read_sql · Function · 0.85

Calls 11

pyarrow_strings_enabled · Function · 0.90

parse_bytes · Function · 0.90

delayed · Function · 0.90

round · Function · 0.85

from_delayed · Method · 0.80

sum · Method · 0.45

memory_usage · Method · 0.45

max · Method · 0.45

min · Method · 0.45

count · Method · 0.45

where · Method · 0.45

Tested by 4

test_query · Function · 0.72

test_query_index_from_query · Function · 0.72

test_query_with_meta · Function · 0.72

test_read_sql_query_string_raises_error · Function · 0.72
