Package: tidyfst 1.8.4

Tian-Yuan Huang

tidyfst: Tidy Verbs for Fast Data Manipulation

A toolkit of tidy data manipulation verbs with 'data.table' as the backend. Combining the elegant syntax of 'dplyr' with the computing performance of 'data.table', 'tidyfst' aims to give users state-of-the-art data manipulation tools with the least pain. This package is an extension of 'data.table'. Besides offering a tidy syntax, it wraps combinations of efficient functions to simplify frequently used data operations.

Authors: Tian-Yuan Huang [aut, cre]

tidyfst_1.8.4.tar.gz
tidyfst_1.8.4.zip (r-4.7) | tidyfst_1.8.4.zip (r-4.6) | tidyfst_1.8.4.zip (r-4.5)
tidyfst_1.8.4.tgz (r-4.6-any) | tidyfst_1.8.4.tgz (r-4.5-any)
tidyfst_1.8.4.tar.gz (r-4.7-any) | tidyfst_1.8.4.tar.gz (r-4.6-any)
tidyfst_1.8.4.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION
card.svg | card.png
tidyfst/json (API)

# Install 'tidyfst' in R:
install.packages('tidyfst', repos = c('https://fastverse.r-universe.dev', 'https://cloud.r-project.org'))
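
Once the package is installed, the tidy verbs described above can be tried on a built-in dataset. The following is only a minimal sketch, assuming the `iris` data that ships with R; the variable names `big_petals` and `avg_by_species` and the derived column `ratio` are illustrative and not taken from the package documentation.

```r
library(tidyfst)

# Filter rows, add a derived column, then keep three columns.
# The verbs take a data.frame and return a data.table.
big_petals <- iris %>%
  filter_dt(Petal.Length > 5) %>%
  mutate_dt(ratio = Sepal.Length / Sepal.Width) %>%
  select_dt(Species, Sepal.Length, ratio)

# Grouped summary through the `by` argument, sorted in descending order.
avg_by_species <- iris %>%
  summarise_dt(avg_sepal = mean(Sepal.Length), by = Species) %>%
  arrange_dt(-avg_sepal)

print(head(big_petals))
print(avg_by_species)
```

Grouping is expressed with `by` inside the verb rather than with a separate grouping step, which mirrors how 'data.table' itself handles grouped computation.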

Bug tracker: https://github.com/hope-data-science/tidyfst/issues

Pkgdown/docs site: https://hope-data-science.github.io

8.99 score | 108 stars | 1 packages | 127 scripts | 1.3k downloads | 133 exports | 13 dependencies

Last updated from: 2b8474bd4b. Checks: 5 WARNING, 2 OK, 2 ERROR. Indexed: no.

Target | Result | Time
linux-devel-x86_64 | WARNING | 144
source / vignettes | OK | 208
linux-release-x86_64 | WARNING | 138
macos-release-arm64 | WARNING | 109
macos-oldrel-arm64 | ERROR | 94
windows-devel | WARNING | 108
windows-release | WARNING | 89
windows-oldrel | ERROR | 113
wasm-release | OK | 108

Exports: %>%, %chin%, %like%, add_count_dt, add_prop, anti_join_dt, arrange_dt, as_dt, as_fst, as.data.table, between, bind_rows_dt, bind_tf_idf_dt, chop_dt, CJ, col_max, col_min, col_rn, complete_dt, copy, count_dt, cummean, data.table, delete_na_cols, delete_na_rows, df_mat, distinct_dt, drop_na_dt, dummy_dt, export_fst, fcase, fcoalesce, fill_na_dt, filter_dt, filter_fst, fintersect, fread, frollapply, fsetdiff, fsetequal, full_join_dt, funion, fwrite, get_fst_chunk_size, group_by_dt, group_dt, group_exe_dt, import_fst, import_fst_chunked, impute_dt, in_dt, inner_join_dt, intersect_dt, key, lag_dt, lead_dt, left_join_dt, like, longer_dt, mat_df, maxth, minth, mutate_dt, mutate_vars, mutate_when, nest_dt, nth, object_size, pairwise_count_dt, parse_fst, percent, pkg_load, pkg_unload, print_options, pst, pull_dt, rbindlist, rec_char, rec_num, relocate_dt, rename_dt, rename_with_dt, replace_dt, replace_na_dt, right_join_dt, rleid, rleidv, rn_col, round0, rowwise_dt, sample_dt, sample_frac_dt, sample_n_dt, select_dt, select_fst, select_mix, semi_join_dt, separate_dt, set, setDF, setdiff_dt, setDT, setequal_dt, setnames, shift_fill, slice_dt, slice_fst, slice_head_dt, slice_max_dt, slice_min_dt, slice_sample_dt, slice_tail_dt, sql_join_dt, squeeze_dt, summarise_dt, summarise_vars, summarise_when, summarize_dt, summarize_vars, summarize_when, summary_fst, sys_time_print, t_dt, tables, transmute_dt, unchop_dt, uncount_dt, union_dt, uniqueN, unite_dt, unnest_dt, utf8_encoding, wider_dt

Dependencies: cli, data.table, fst, fstcore, glue, lifecycle, magrittr, pak, Rcpp, rlang, stringi, stringr, vctrs

Use data.table the tidy way: An ultimate tutorial of tidyfst
Create example data | Basic operations | Filter rows | Sort rows | Select columns | Summarise data | Group computation (by) | Going further | Advanced columns manipulation | Advanced use of by | Miscellaneous | Read / Write data | Reshape data | Other | Join/Bind data sets | Join | Bind | Set operations | Summary

Last update: 2025-05-07
Started: 2020-03-24

Performance
Q1 | Q2 | Q3 | Q4 | Q5 | Last words | Session information

Last update: 2024-04-15
Started: 2020-07-11

Example 1: Basic usage
Use tidyfst just like dplyr | Filter rows with filter_dt() | Arrange rows with arrange_dt() | Select columns with select_dt() | Add new columns with mutate_dt() | Summarise values with summarise_dt() | Randomly sample rows with sample_n_dt() and sample_frac_dt() | Grouped operations | Comparison with data.table syntax | Data | Subset rows | Select column(s) | Mixed computation | Refer to columns by names | Aggregations

Last update: 2022-04-27
Started: 2020-02-15

Example 2: Join tables
Controlling how the tables are matched | Types of join | Filtering joins | Set operations

Last update: 2022-04-27
Started: 2020-02-15

Example 3: Reshape
Longer | Wider | More complicated example

Last update: 2022-04-27
Started: 2020-02-15

Example 4: Nest

Last update: 2022-04-27
Started: 2020-02-15

Example 5: Fst

Last update: 2022-04-27
Started: 2020-02-27

Example 6: Dt

Last update: 2022-04-27
Started: 2020-03-04

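Case analysis with the tidyfst package
Constructing test data | Basics | Tips | Aggregation | 1. Find the mean, median and maximum price of diamonds for each cut type and each color | 2. Find the order with the highest selling price for each day | join | 1. Left join dat1 and dat2 on the dt column | 2. Multiple joins | Converting between long and wide tables | 1. Long table to wide table | 2. Wide table to long table | Advanced | Fill missing values upward/downward | Add sub-dimension aggregation results as new columns | 1. Using dat1 as an example, add two columns: one with the mean of price aggregated by cut and color, the other with its standard deviation | 2. Using dat1 as an example, add a sequence-number column id grouped by dt | Moving functions | System parameters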

Last update: 2020-05-02
Started: 2020-03-14

Readme and manuals

Help Manual

Help page | Topics
Not in operator | %notin%
Arrange entries in data.frame | arrange_dt
Save a data.frame as a fst table | as_fst
Bind multiple data frames by row | bind_rows_dt
Compute TF–IDF Using data.table with Optional Counting and Grouping | bind_tf_idf_dt
Get the column name of the max/min number each row | col_max, col_min
Complete a data frame with missing combinations of data | complete_dt
Count observations by group | add_count_dt, count_dt
Cumulative mean | cummean
Select distinct/unique rows in data.frame | distinct_dt
Dump, replace and fill missing values in data.frame | delete_na_cols, delete_na_rows, drop_na_dt, fill_na_dt, replace_na_dt, shift_fill
Fast creation of dummy variables | dummy_dt
Read and write fst files | export_fst, import_fst
Filter entries in data.frame | filter_dt
Parse, inspect and extract data.table from fst file | filter_fst, fst, parse_fst, select_fst, slice_fst, summary_fst
Group by variable(s) and implement operations | group_by_dt, group_exe_dt
Data manipulation within groups | group_dt, rowwise_dt
Read a fst file by chunks | get_fst_chunk_size, import_fst_chunked
Impute missing values with mean, median or mode | impute_dt
Short cut to data.table | as_dt, in_dt
Set operations for data frames | intersect_dt, setdiff_dt, setequal_dt, union_dt
Join tables | anti_join_dt, full_join_dt, inner_join_dt, join, left_join_dt, right_join_dt, semi_join_dt
Fast lead/lag for vectors | lag_dt, lead_dt
Pivot data from wide to long | longer_dt
Conversion between tidy table and named matrix | df_mat, mat_df
Mutate columns in data.frame | mutate_dt, transmute_dt
Conditional update of columns in data.table | mutate_vars, mutate_when
Nest and unnest | chop_dt, nest_dt, squeeze_dt, unchop_dt, unnest_dt
Extract the nth value from a vector | maxth, minth, nth
Nice printing of report the Space Allocated for an Object | object_size
Count pairs of items within a group | pairwise_count_dt
Add percentage to counts in data.frame | add_prop, percent
Load or unload R package(s) | pkg_load, pkg_unload
Set global printing method for data.table | print_options
Pull out a single variable | pull_dt
Recode number or strings | rec, rec_char, rec_num
Change column order | relocate_dt
Rename column in data.frame | rename_dt, rename_with_dt
Fast value replacement in data frame | replace_dt
Tools for working with row names | col_rn, rn_col
Round a number and make it show zeros | round0
Sample rows randomly from a table | sample_dt, sample_frac_dt, sample_n_dt
Select column from data.frame | select_dt, select_mix
Separate a character column into two columns using a regular expression separator | separate_dt
Subset rows using their positions | slice_dt, slice_head_dt, slice_max_dt, slice_min_dt, slice_sample_dt, slice_tail_dt
Case insensitive table joining like SQL | sql_join, sql_join_dt
Summarise columns to single values | summarise_dt, summarise_vars, summarise_when, summarize_dt, summarize_vars, summarize_when
Convenient print of time taken | pst, sys_time_print
Efficient transpose of data.frame | t_dt
"Uncount" a data frame | uncount_dt
Unite multiple columns into one by pasting strings together | unite_dt
Use UTF-8 for character encoding in a data frame | utf8_encoding
Pivot data from long to wide | wider_dt