github.com/Kanaries/pygwalker @0.5.0.1 sqlite

repository ↗ · DeepWiki ↗ · release 0.5.0.1 ↗
647 symbols 2,317 edges 128 files 206 documented · 32%
README

English | Español | Français | Deutsch | 中文 | Türkçe | 日本語 | 한국어 | Русский

PyGWalker: A Python Library for Exploratory Data Analysis with Visualization

arXiv: https://arxiv.org/abs/2406.11637 · PyPI: https://pypi.org/project/pygwalker · conda-forge: https://anaconda.org/conda-forge/pygwalker · Binder: https://mybinder.org/v2/gh/Kanaries/pygwalker/main

Community: Discord https://discord.gg/Z4ngFWXz2U · Twitter @kanaries_data · Slack https://kanaries-community.slack.com/join/shared_invite/zt-20kpp56wl-ke9S0MxTcNQjUhKf6SOfvQ#/shared-invite/email

PyGWalker can simplify your Jupyter Notebook data analysis and data visualization workflow, by turning your pandas dataframe into an interactive user interface for visual exploration.

PyGWalker (pronounced like "Pig Walker", just for fun) is named as an abbreviation of "Python binding of Graphic Walker". It integrates Jupyter Notebook with Graphic Walker, an open-source alternative to Tableau. It allows data scientists to visualize, clean, and annotate data with simple drag-and-drop operations and even natural language queries.

https://github.com/Kanaries/pygwalker/assets/22167673/2b940e11-cf8b-4cde-b7f6-190fb10ee44b

[!TIP] If you want more AI features, we also built runcell, an AI Code Agent in Jupyter that understands your code, data, and cells, and can generate code, execute cells, and take actions for you. It can be used in JupyterLab after running pip install runcell.

https://github.com/user-attachments/assets/9ec64252-864d-4bd1-8755-83f9b0396d38

Visit Google Colab, Kaggle Code or Graphic Walker Online Demo to test it out!

If you prefer using R, check GWalkR, the R wrapper of Graphic Walker. If you prefer a Desktop App that can be used offline and without any coding, check out PyGWalker Desktop.

Features

PyGWalker is a Python library that simplifies data analysis and visualization workflows by turning pandas DataFrames into interactive visual interfaces. It offers a variety of features that make it a powerful tool for data exploration:

  • Interactive Data Exploration:
      • Drag-and-drop interface for easy visualization creation.
      • Real-time updates as you make changes to the visualization.
      • Ability to zoom, pan, and filter the data.
  • Data Cleaning and Transformation:
      • Visual data cleaning tools to identify and remove outliers or inconsistencies.
      • Ability to create new variables and features based on existing data.
  • Advanced Visualization Capabilities:
      • Support for various chart types (bar charts, line charts, scatter plots, etc.).
      • Customization options for colors, labels, and other visual elements.
      • Interactive features like tooltips and drill-down capabilities.
  • Integration with Jupyter Notebooks:
      • Seamless integration with Jupyter Notebooks for a smooth workflow.
  • Open-Source and Free:
      • Available for free and allows for customization and extension.

Getting Started

Check out our video tutorial on using pygwalker, pygwalker + streamlit, and pygwalker + snowflake: How to explore data with PyGWalker in Python

| Run in Kaggle | Run in Colab |
|---|---|
| Kaggle Code | Google Colab |

Setup pygwalker

Before using pygwalker, make sure to install the packages through the command line using pip or conda.

pip

pip install pygwalker

Note

To keep your installation current with the latest release, run pip install pygwalker --upgrade. To try the latest features and bug fixes before they reach a stable release, run pip install pygwalker --upgrade --pre.

Conda-forge

conda install -c conda-forge pygwalker

or

mamba install -c conda-forge pygwalker

See conda-forge feedstock for more help.
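After installing, it can help to confirm which version the current environment actually sees, especially when pip and conda environments coexist. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example and is not part of pygwalker.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pygwalker")
    print(f"pygwalker {found}" if found else "pygwalker is not installed")
```

Run it with the same interpreter that backs your Jupyter kernel, since a different interpreter may report a different installation.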

Use pygwalker in Jupyter Notebook

Quick Start

Import pygwalker and pandas into your Jupyter Notebook to get started.

import pandas as pd
import pygwalker as pyg

You can use pygwalker without breaking your existing workflow. For example, you can call up PyGWalker with the dataframe loaded in this way:

df = pd.read_csv('./bike_sharing_dc.csv')
walker = pyg.walk(df)

That's it. Now you have an interactive UI to analyze and visualize data with simple drag-and-drop operations.
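If you do not have bike_sharing_dc.csv at hand, a small stand-in file is enough to try the quick start. The sketch below writes one with the standard library and then opens PyGWalker on it. The column names and values are invented for illustration and do not reflect the real dataset; `write_sample_csv` and `open_walker` are hypothetical helpers, not pygwalker functions.

```python
import csv
from pathlib import Path

# Hypothetical columns and values, chosen only for illustration; the real
# bike_sharing_dc.csv may look different.
SAMPLE_ROWS = [
    {"date": "2024-01-01", "season": "winter", "rides": 120},
    {"date": "2024-04-01", "season": "spring", "rides": 340},
    {"date": "2024-07-01", "season": "summer", "rides": 510},
]


def write_sample_csv(path):
    """Write SAMPLE_ROWS to `path` as CSV and return the path."""
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(SAMPLE_ROWS[0]))
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)
    return path


def open_walker(csv_path):
    """Load the CSV and open PyGWalker on it (requires pandas and pygwalker)."""
    import pandas as pd
    import pygwalker as pyg

    return pyg.walk(pd.read_csv(csv_path))
```

In a notebook cell, `walker = open_walker(write_sample_csv("sample.csv"))` should then render the same interface as the quick start above.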

Cool things you can do with PyGWalker:

  • You can change the mark type to make different charts, for example, a line chart.

  • To compare different measures, you can create a concat view by adding more than one measure into rows/columns.

  • To make a facet view of several subviews divided by the values of a dimension, put dimensions into rows or columns.

  • PyGWalker contains a powerful data table, which provides a quick view of the data, its distribution, and its profile. You can also add filters or change the data types in the table.

  • You can save the data exploration result to a local file.

Better Practices

There are some important parameters you should know when using pygwalker:

  • spec: saves and loads the chart configuration (a JSON string or a file path).
  • kernel_computation: uses DuckDB as the computing engine, which lets you handle larger datasets faster on your local machine.
  • use_kernel_calc: deprecated; use kernel_computation instead.

df = pd.read_csv('./bike_sharing_dc.csv')
walker = pyg.walk(
    df,
    spec="./chart_meta_0.json",    # this JSON file stores your chart state; click the save button in the UI manually when you finish a chart ('autosave' will be supported in the future)
    kernel_computation=True,       # use DuckDB as the computing engine, which supports exploring larger datasets (<=100GB)
)
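One way to apply these parameters is to enable kernel_computation only when the input file is large. The sketch below builds the keyword arguments from the file size. The helper `walk_options` and the 100 MiB cutoff are assumptions made for this example; PyGWalker itself does not define such a threshold.

```python
import os

# Assumed cutoff for illustration only; PyGWalker does not define this value.
LARGE_FILE_BYTES = 100 * 1024 * 1024


def walk_options(csv_path, spec_path, large_file_bytes=LARGE_FILE_BYTES):
    """Build keyword arguments for pyg.walk based on the size of the CSV file."""
    return {
        "spec": spec_path,
        # Turn on the DuckDB engine only for files at or above the cutoff.
        "kernel_computation": os.path.getsize(csv_path) >= large_file_bytes,
    }
```

A call would then look like `pyg.walk(df, **walk_options("./bike_sharing_dc.csv", "./chart_meta_0.json"))`. Tune the cutoff to your own machine rather than treating it as a recommendation.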

Example in local notebook

Example in cloud notebook

Programmatic Export of Charts

After saving a chart from the UI, you can retrieve the image directly from Python.

walker = pyg.walk(df, spec="./chart_meta_0.json")
# edit the chart in the UI and click the save button
walker.save_chart_to_file("Chart 1", "chart1.svg", save_type="svg")
png_bytes = walker.export_chart_png("Chart 1")
svg_bytes = walker.export_chart_svg("Chart 1")
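The exported bytes can be written to disk like any other binary data. The sketch below assumes that `export_chart_png` returns raw PNG bytes, as the variable name `png_bytes` suggests, and checks the standard eight-byte PNG signature before writing. `save_png` is a hypothetical helper, not part of pygwalker.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_png(data: bytes, path) -> Path:
    """Write `data` to `path` after checking that it starts with the PNG signature."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not look like a PNG image")
    path = Path(path)
    path.write_bytes(data)
    return path
```

For example, `save_png(walker.export_chart_png("Chart 1"), "chart1.png")`. The signature check gives an early error if the export returned something other than an image, for instance because the chart was never saved from the UI.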

Use pygwalker in Streamlit

Streamlit allows you to host a web version of pygwalker without having to work out the details of how web applications work.

Here are some example apps built with pygwalker and streamlit:

  • PyGWalker + streamlit for Bike sharing dataset
  • Earthquake Dashboard

from pygwalker.api.streamlit import StreamlitRenderer
import pandas as pd
import streamlit as st

# Adjust the width of the Streamlit page
st.set_page_config(
    page_title="Use Pygwalker In Streamlit",
    layout="wide"
)

# Add Title
st.title("Use Pygwalker In Streamlit")

# You should cache your pygwalker renderer, if you don't want your memory to explode
@st.cache_resource
def get_pyg_renderer() -> "StreamlitRenderer":
    df = pd.read_csv("./bike_sharing_dc.csv")
    # To enable saving the chart config, set `spec_io_mode="rw"`
    return StreamlitRenderer(df, spec="./gw_config.json", spec_io_mode="rw")


renderer = get_pyg_renderer()

renderer.explorer()

API Reference

pygwalker.walk

| Parameter | Type | Default | Description |
|---|---|---|---|
| dataset | Union[DataFrame, Connector] | - | The dataframe or connector to be used. |
| gid | Union[int, str] | None | ID for the GraphicWalker container div, formatted as 'gwalker-{gid}'. |
| env | Literal['Jupyter', 'JupyterWidget'] | 'JupyterWidget' | Environment using pygwalker. |
| field_specs | Optional[Dict[str, FieldSpec]] | None | Specifications of fields. Will be automatically inferred from dataset if not specified. |
| hide_data_source_config | bool | True | If True, hides DataSource import and export button. |
| theme_key | Literal['vega', 'g2'] | 'g2' | Theme type for the GraphicWalker. |
| appearance | Literal['media', 'light', 'dark'] | 'media' | Theme setting. 'media' will auto-detect the OS theme. |
| spec | str | "" | Chart configuration data. Can be a configur |
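Because theme_key and appearance accept only a fixed set of literals, it can be convenient to validate them before calling pygwalker.walk. The sketch below copies the allowed values and defaults from the parameter table; `display_options` is a hypothetical helper written for this example and is not part of pygwalker.

```python
# Allowed values copied from the parameter table above.
THEME_KEYS = {"vega", "g2"}
APPEARANCES = {"media", "light", "dark"}


def display_options(theme_key="g2", appearance="media", hide_data_source_config=True):
    """Validate display settings and return them as keyword arguments for walk."""
    if theme_key not in THEME_KEYS:
        raise ValueError(f"theme_key must be one of {sorted(THEME_KEYS)}")
    if appearance not in APPEARANCES:
        raise ValueError(f"appearance must be one of {sorted(APPEARANCES)}")
    return {
        "theme_key": theme_key,
        "appearance": appearance,
        "hide_data_source_config": bool(hide_data_source_config),
    }
```

A call such as `pyg.walk(df, **display_options(theme_key="vega", appearance="dark"))` would then pass only values listed in the table.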

Extension points exported contracts — how you extend this code

ISolutionProps (Interface)
(no doc)
app/src/components/options.tsx
ICodeExport (Interface)
(no doc)
app/src/components/codeExportModal/index.tsx
IInitModal (Interface)
(no doc)
app/src/components/initModal/index.tsx
RuncellBannerProps (Interface)
(no doc)
app/src/components/runcellBanner/index.tsx
IUploadChartModal (Interface)
(no doc)
app/src/components/uploadChartModal/index.tsx

Core symbols most depended-on inside this repo

_update_single_chart_spec
called by 28
pygwalker/api/component.py
cn
called by 26
app/src/lib/utils.ts
register
called by 19
pygwalker/communications/base.py
copy
called by 18
pygwalker/api/component.py
display_html
called by 17
pygwalker/utils/display.py
_get_props
called by 15
pygwalker/api/pygwalker.py
text
called by 13
pygwalker/api/component.py
encode
called by 13
pygwalker/api/component.py

Shape

Method 287
Function 278
Class 58
Interface 21
Route 3

Languages

Python 80%
TypeScript 20%

Modules by API surface

pygwalker/api/pygwalker.py · 47 symbols
pygwalker/api/component.py · 35 symbols
pygwalker/data_parsers/base.py · 33 symbols
pygwalker/services/cloud_service.py · 32 symbols
pygwalker/data_parsers/database_parser.py · 27 symbols
pygwalker_tools/metrics/api.py · 20 symbols
app/src/index.tsx · 19 symbols
pygwalker/api/streamlit.py · 17 symbols
pygwalker/data_parsers/cloud_dataset_parser.py · 16 symbols
pygwalker/api/webserver.py · 16 symbols
pygwalker/utils/custom_sqlglot.py · 15 symbols
pygwalker/data_parsers/spark_parser.py · 15 symbols

Dependencies from manifests, versioned

@anywidget/react 0.0.8 · 1×
@headlessui/react 1.7.14 · 1×
@heroicons/react 2.0.8 · 1×
@kanaries/graphic-walker 0.5.0-alpha.2 · 1×
@kanaries/gw-dsl-parser 0.1.49 · 1×
@radix-ui/react-checkbox 1.3.3 · 1×
@radix-ui/react-dialog 1.1.15 · 1×
@radix-ui/react-icons 1.3.2 · 1×
@radix-ui/react-label 2.1.8 · 1×
@radix-ui/react-slot 1.2.4 · 1×

Datastores touched

(mysql)Database · 1 repos

For agents

$ claude mcp add pygwalker \
  -- python -m otcore.mcp_server <graph>
