
Function read_sql_query

dask/dataframe/io/sql.py:17–192

Read SQL query into a DataFrame. If neither ``divisions`` or ``npartitions`` is given, the memory footprint of the first few rows will be determined, and partitions of size ~256MB will be used.
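
The sizing rule in the summary above can be sketched in a few lines. This is an illustration under stated assumptions, not the dask implementation: `estimate_npartitions` and its parameter names are hypothetical, and the target size is passed as an integer number of bytes rather than as a string such as "256 MiB".

```python
MIB = 2**20  # bytes in one mebibyte


def estimate_npartitions(head_bytes, head_rows, total_rows, bytes_per_chunk=256 * MIB):
    """Estimate a partition count from a small sample of rows (hypothetical helper)."""
    # Average memory footprint of one row, measured on the sampled head.
    bytes_per_row = head_bytes / head_rows
    # Projected size of the full result divided by the target partition size.
    estimate = int(round(total_rows * bytes_per_row / bytes_per_chunk))
    # Always produce at least one partition, even for tiny results.
    return estimate or 1


# Assumed sample: 5 rows occupying 400 bytes, so 80 bytes per row.
large = estimate_npartitions(head_bytes=400, head_rows=5, total_rows=10_000_000)
small = estimate_npartitions(head_bytes=400, head_rows=5, total_rows=100)
```

With these assumed numbers, ten million rows project to roughly 800 MB and so to three partitions, while the hundred-row result falls back to a single partition.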

(
    sql,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    meta=None,
    engine_kwargs=None,
    **kwargs,
)

Source

def read_sql_query(
    sql,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    meta=None,
    engine_kwargs=None,
    **kwargs,
):
    """
    Read SQL query into a DataFrame.

    If neither ``divisions`` or ``npartitions`` is given, the memory footprint of the
    first few rows will be determined, and partitions of size ~256MB will
    be used.

    Parameters
    ----------
    sql : SQLAlchemy Selectable
        SQL query to be executed. TextClause is not supported
    con : str
        Full sqlalchemy URI for the database connection
    index_col : str
        Column which becomes the index, and defines the partitioning. Should
        be a indexed column in the SQL server, and any orderable type. If the
        type is number or time, then partition boundaries can be inferred from
        ``npartitions`` or ``bytes_per_chunk``; otherwise must supply explicit
        ``divisions``.
    divisions: sequence
        Values of the index column to split the table by. If given, this will
        override ``npartitions`` and ``bytes_per_chunk``. The divisions are the value
        boundaries of the index column used to define the partitions. For
        example, ``divisions=list('acegikmoqsuwz')`` could be used to partition
        a string column lexicographically into 12 partitions, with the implicit
        assumption that each partition contains similar numbers of records.
    npartitions : int
        Number of partitions, if ``divisions`` is not given. Will split the values
        of the index column linearly between ``limits``, if given, or the column
        max/min. The index column must be numeric or time for this to work
    limits: 2-tuple or None
        Manually give upper and lower range of values for use with ``npartitions``;
        if None, first fetches max/min from the DB. Upper limit, if
        given, is inclusive.
    bytes_per_chunk : str or int
        If both ``divisions`` and ``npartitions`` is None, this is the target size of
        each partition, in bytes
    head_rows : int
        How many rows to load for inferring the data-types, and memory per row
    meta : empty DataFrame or None
        If provided, do not attempt to infer dtypes, but use these, coercing
        all chunks on load
    engine_kwargs : dict or None
        Specific db engine parameters for sqlalchemy
    kwargs : dict
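
The relationship between `npartitions`, `limits` and `divisions` described in the docstring can be illustrated with plain Python. This is a sketch, not the library's code: `linear_divisions` and `partition_bounds` are hypothetical helpers, and treating every range except the last as half-open is an assumption made here so that no row falls into two partitions while the upper limit stays inclusive.

```python
def linear_divisions(lower, upper, npartitions):
    """Split [lower, upper] into npartitions equal-width ranges (hypothetical helper)."""
    step = (upper - lower) / npartitions
    boundaries = [lower + i * step for i in range(npartitions)]
    boundaries.append(upper)  # keep the exact upper limit as the final boundary
    return boundaries


def partition_bounds(divisions):
    """Pair consecutive boundaries; only the last range includes its upper bound."""
    last = len(divisions) - 2
    return [
        (lo, hi, "<=" if i == last else "<")
        for i, (lo, hi) in enumerate(zip(divisions[:-1], divisions[1:]))
    ]


# Numeric index: boundaries inferred from limits=(0, 100) and npartitions=4.
numeric_bounds = partition_bounds(linear_divisions(0, 100, 4))

# String index: explicit divisions, using the example from the docstring.
string_bounds = partition_bounds(list("acegikmoqsuwz"))
```

Each tuple is (lower, upper, comparison for the upper bound). The thirteen string boundaries yield twelve ranges, matching the count stated in the docstring.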

Callers 6

test_query · Function · 0.90
test_query_with_meta · Function · 0.90
read_sql_table · Function · 0.85
read_sql · Function · 0.85

Calls 11

pyarrow_strings_enabled · Function · 0.90
parse_bytes · Function · 0.90
delayed · Function · 0.90
round · Function · 0.85
from_delayed · Method · 0.80
sum · Method · 0.45
memory_usage · Method · 0.45
max · Method · 0.45
min · Method · 0.45
count · Method · 0.45
where · Method · 0.45

Tested by 4

test_query · Function · 0.72
test_query_with_meta · Function · 0.72
