feat: parquet #738

Merged
merged 2 commits into from
Nov 19, 2024

Conversation

yihong0618
Owner

@yihong0618 yihong0618 commented Nov 19, 2024

You can now query the exported Parquet file directly with DuckDB.

pip3 install duckdb

import duckdb

# Total running distance per city, read straight from the remote Parquet file.
query = """
select
    regexp_extract(location_country, '[\u4e00-\u9fa5]{2,}(市|自治州|特别行政区)') as run_location,
    concat(try_cast(sum(distance/1000) as integer)::varchar, ' km') as run_distance
from read_parquet('https://github.com/yihong0618/run/raw/refs/heads/master/run_page/data.parquet')
where run_location is not NULL
group by run_location
order by sum(distance) desc;
"""
duckdb.sql(query).show(max_rows=50)

Output:

┌──────────────┬──────────────┐
│ run_location │ run_distance │
│   varchar    │   varchar    │
├──────────────┼──────────────┤
│ 大连市       │ 9328 km      │
│ 沈阳市       │ 2030 km      │
│ 北京市       │ 61 km        │
│ 长沙市       │ 24 km        │
│ 扬州市       │ 21 km        │
│ 盘锦市       │ 21 km        │
│ 烟台市       │ 21 km        │
│ 上海市       │ 12 km        │
│ 北九州市     │ 7 km         │
│ 丹东市       │ 5 km         │
│ 瓦房店市     │ 4 km         │
│ 竹田市       │ 3 km         │
│ 伊万里市     │ 2 km         │
│ 长春市       │ 1 km         │
│ 锦州市       │ 1 km         │
│              │ 0 km         │
├──────────────┴──────────────┤
│ 16 rows           2 columns │
└─────────────────────────────┘
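For readers who want to see what the query computes without DuckDB, here is a rough pure-Python sketch of the same steps: pull a city name out of `location_country` with the same regex, sum the distances per city, and print whole kilometres. The sample rows and the comma-separated address format are assumptions made up for illustration, not data from the Parquet file, and `extract_city`, `distance_by_city`, and `skip_unmatched` are hypothetical names. It also suggests why the table above has a blank `run_location` row: as far as I know, `regexp_extract` returns an empty string rather than NULL when nothing matches, so `is not NULL` does not filter those rows out.

```python
import re
from collections import defaultdict

# Same pattern as the query: two or more CJK characters followed by a city-level suffix.
CITY_PATTERN = re.compile(r"[\u4e00-\u9fa5]{2,}(市|自治州|特别行政区)")


def extract_city(location_country):
    """Return the whole match, or '' when nothing matches (mirrors regexp_extract)."""
    match = CITY_PATTERN.search(location_country or "")
    return match.group(0) if match else ""


def distance_by_city(rows, skip_unmatched=False):
    """Sum distances (assumed to be in metres) per city, largest total first."""
    totals = defaultdict(float)
    for location_country, distance_m in rows:
        city = extract_city(location_country)
        if skip_unmatched and not city:
            continue  # drop rows where no city could be extracted
        totals[city] += distance_m
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    # round() only approximates the SQL integer cast; exact .5 cases may differ.
    return [(city, f"{round(total / 1000)} km") for city, total in ranked]


# Hypothetical sample rows: (location_country, distance in metres).
SAMPLE_ROWS = [
    ("甘井子区, 大连市, 辽宁省, 中国", 5200.0),
    ("中山区, 大连市, 辽宁省, 中国", 10100.0),
    ("和平区, 沈阳市, 辽宁省, 中国", 3400.0),
    ("", 300.0),
]

print(distance_by_city(SAMPLE_ROWS))
print(distance_by_city(SAMPLE_ROWS, skip_unmatched=True))
```

Passing `skip_unmatched=True` drops the blank row; the SQL equivalent would presumably be to compare `run_location` against `''` as well as NULL.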

Signed-off-by: yihong0618 <[email protected]>

vercel bot commented Nov 19, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name: running-page
Status: ✅ Ready
Updated (UTC): Nov 19, 2024 3:28am

@yihong0618 yihong0618 merged commit 058a222 into master Nov 19, 2024
10 checks passed
@yihong0618 yihong0618 changed the title from "feat: pargent" to "feat: parquet" Nov 19, 2024
@yihong0618 yihong0618 deleted the support_duckdb branch November 19, 2024 03:34