collapse: Advanced and Fast Data Transformation

A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer-friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations and OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory-efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
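
As an illustration of the grouped and weighted statistical functions described above, the following is a minimal sketch using fmean on a small made-up vector. The data, the grouping vector g and the weights w are invented for this example; the g, w and na.rm arguments are those of fmean as documented in the package.

```r
library(collapse)

# Made-up example data: one missing value, two groups, arbitrary weights
x <- c(1, 2, NA, 4, 5, 6)
g <- c("a", "a", "a", "b", "b", "b")
w <- c(1, 1, 1, 2, 1, 1)

# Overall mean; the missing value is skipped by default (na.rm = TRUE)
m_all <- fmean(x)

# Grouped mean: one value per group in g
m_grp <- fmean(x, g = g)

# Grouped and weighted mean
m_wgt <- fmean(x, g = g, w = w)

# Turning off the default yields NA when a missing value is present
m_na <- fmean(x, na.rm = FALSE)

print(m_all)
print(m_grp)
print(m_wgt)
print(m_na)
```

The same call pattern applies to matrices and data frames, where the statistic is computed column by column.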

Authors: Sebastian Krantz [aut, cre], Matt Dowle [ctb], Arun Srinivasan [ctb], Morgan Jacob [ctb], Dirk Eddelbuettel [ctb], Laurent Berge [ctb], Kevin Tappe [ctb], R Core Team and contributors worldwide [ctb], Martyn Plummer [cph], 1999-2016 The R Core Team [cph]

  GGDC10S - Groningen Growth and Development Centre 10-Sector Database
  wlddev - World Development Dataset
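
The two bundled datasets can be used directly once the package is attached. The sketch below aggregates wlddev by group with fgroup_by and fsummarise; the column names region and LIFEEX are assumed from the dataset's documentation, and the output column name mean_LIFEEX is chosen here for illustration only.

```r
library(collapse)

# Dimensions of the two datasets shipped with the package
print(dim(GGDC10S))
print(dim(wlddev))

# Mean life expectancy by region.
# Assumes wlddev has columns named 'region' and 'LIFEEX'.
le_by_region <- wlddev |>
  fgroup_by(region) |>
  fsummarise(mean_LIFEEX = fmean(LIFEEX))

print(le_by_region)
```

The native pipe |> requires R 4.1 or later; on older versions the calls can be nested instead.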



Help page: Topics
Advanced and Fast Data Transformation: collapse-package collapse
Apply Functions Across Multiple Columns: across
Fast Row/Column Arithmetic for Matrix-Like Objects: %c*% %c+% %c-% %c/% %cr% %r*% %r+% %r-% %r/% %rr% arithmetic
Split-Apply-Combine Computing: BY BY.default BY.grouped_df BY.matrix
Advanced Data Aggregation: A5-advanced-aggregation advanced-aggregation collap collapg collapv
Collapse Documentation & Overview: .COLLAPSE_ALL .COLLAPSE_DATA .COLLAPSE_GENERIC .COLLAPSE_TOPICS A0-collapse-documentation collapse-documentation
_collapse_ Package Options: .op AA4-collapse-options collapse-options get_collapse set_collapse
Renamed Functions: .COLLAPSE_OLD as.character_factor as.factor_GRP as.factor_qG as.numeric_factor collapse-renamed Date_vars Date_vars<- fHDbetween fHDbetween.default fHDbetween.grouped_df fHDbetween.matrix fHDbetween.pdata.frame fHDbetween.pseries fHDwithin fHDwithin.default fHDwithin.grouped_df fHDwithin.matrix fHDwithin.pdata.frame fHDwithin.pseries fNdistinct fNdistinct.default fNdistinct.grouped_df fNdistinct.matrix fNobs fNobs.default fNobs.grouped_df fNobs.matrix is.categorical is.Date is.GRP is.qG is.unlistable pwNobs replace_Inf replace_NA
Fast Reordering of Data Frame Columns: colorder colorderv
Data Apply: dapply
Data Transformations: .OPERATOR_FUN A6-data-transformations data-transformations
Detailed Statistical Description of Data: descr descr.default descr.grouped_df print.descr [.descr
Small Functions to Make R Programming More Efficient: %!=% %*=% %+=% %-=% %/=% %==% AA2-efficient-programming allNA alloc allv anyv cinv copyv efficient-programming fdim fncol fnlevels fnrow missing_cases na_focb na_insert na_locf na_omit na_rm seq_col seq_row setop setv vec vgcd vlengths vtypes whichNA whichv
Fast Data Manipulation: A3-fast-data-manipulation fast-data-manipulation
Fast Grouping and Ordering: A2-fast-grouping-ordering fast-grouping-ordering
Fast (Grouped, Weighted) Statistical Functions for Matrix-Like Objects: .FAST_FUN .FAST_STAT_FUN A1-fast-statistical-functions fast-statistical-functions
Fast Between (Averaging) and (Quasi-)Within (Centering) Transformations: B B.default B.grouped_df B.matrix B.pdata.frame B.pseries fbetween fbetween.default fbetween.grouped_df fbetween.matrix fbetween.pdata.frame fbetween.pseries fwithin fwithin.default fwithin.grouped_df fwithin.matrix fwithin.pdata.frame fwithin.pseries W W.default W.grouped_df W.matrix W.pdata.frame W.pseries
Efficiently Count Observations by Group: fcount fcountv
Fast (Grouped, Ordered) Cumulative Sum for Matrix-Like Objects: fcumsum fcumsum.default fcumsum.grouped_df fcumsum.matrix fcumsum.pdata.frame fcumsum.pseries
Fast (Quasi-, Log-) Differences for Time Series and Panel Data: D D.default D.grouped_df D.list D.matrix D.pdata.frame D.pseries Dlog Dlog.default Dlog.grouped_df Dlog.list Dlog.matrix Dlog.pdata.frame Dlog.pseries fdiff fdiff.default fdiff.grouped_df fdiff.list fdiff.matrix fdiff.pdata.frame fdiff.pseries
Fast and Flexible Distance Computations: fdist
Fast Removal of Unused Factor Levels: fdroplevels fdroplevels.factor
Fast (Grouped) First and Last Value for Matrix-Like Objects: ffirst ffirst.default ffirst.grouped_df ffirst.matrix flast flast.default flast.grouped_df flast.matrix
Fast (Weighted) F-test for Linear Models (with Factors): fFtest fFtest.default fFtest.formula
Fast Growth Rates for Time Series and Panel Data: fgrowth fgrowth.default fgrowth.grouped_df fgrowth.list fgrowth.matrix fgrowth.pdata.frame fgrowth.pseries G G.default G.grouped_df G.list G.matrix G.pdata.frame G.pseries
Higher-Dimensional Centering and Linear Prediction: fhdbetween fhdbetween.default fhdbetween.matrix fhdbetween.pdata.frame fhdbetween.pseries fhdwithin fhdwithin.default fhdwithin.matrix fhdwithin.pdata.frame fhdwithin.pseries HDB HDB.default HDB.matrix HDB.pdata.frame HDB.pseries HDW HDW.default HDW.matrix HDW.pdata.frame HDW.pseries
Fast Lags and Leads for Time Series and Panel Data: F F.default F.grouped_df F.matrix F.pdata.frame F.pseries flag flag.default flag.grouped_df flag.matrix flag.pdata.frame flag.pseries L L.default L.grouped_df L.matrix L.pdata.frame L.pseries
Fast (Weighted) Linear Model Fitting: flm flm.default flm.formula
Fast Matching: %!iin% %!in% %iin% ckmatch fmatch
Fast (Grouped, Weighted) Mean for Matrix-Like Objects: fmean fmean.default fmean.grouped_df fmean.matrix
Fast (Grouped) Maxima and Minima for Matrix-Like Objects: fmax fmax.default fmax.grouped_df fmax.matrix fmin fmin.default fmin.grouped_df fmin.matrix
Fast (Grouped, Weighted) Statistical Mode for Matrix-Like Objects: fmode fmode.default fmode.grouped_df fmode.matrix
Fast (Grouped) Distinct Value Count for Matrix-Like Objects: fndistinct fndistinct.default fndistinct.grouped_df fndistinct.matrix
Fast (Grouped) Observation Count for Matrix-Like Objects: fnobs fnobs.default fnobs.grouped_df fnobs.matrix
Fast (Grouped, Weighted) N'th Element/Quantile for Matrix-Like Objects: fmedian fmedian.default fmedian.grouped_df fmedian.matrix fnth fnth.default fnth.grouped_df fnth.matrix
Fast (Grouped, Weighted) Product for Matrix-Like Objects: fprod fprod.default fprod.grouped_df fprod.matrix
Fast (Weighted) Sample Quantiles and Range: .quantile .range fquantile frange
Fast Renaming and Relabelling Objects: frename relabel rnm setrelabel setrename
Fast (Grouped, Weighted) Scaling and Centering of Matrix-like Objects: fscale fscale.default fscale.grouped_df fscale.matrix fscale.pdata.frame fscale.pseries STD STD.default STD.grouped_df STD.matrix STD.pdata.frame STD.pseries
Fast Select, Replace or Add Data Frame Columns: add_vars add_vars<- av av<- cat_vars cat_vars<- char_vars char_vars<- date_vars date_vars<- fact_vars fact_vars<- fselect fselect<- get_vars get_vars<- gv gv<- gvr gvr<- logi_vars logi_vars<- num_vars num_vars<- nv nv<- slt slt<-
Fast Subsetting Matrix-Like Objects: fsubset fsubset.default fsubset.matrix fsubset.pdata.frame fsubset.pseries sbt ss
Fast (Grouped, Weighted) Sum for Matrix-Like Objects: fsum fsum.default fsum.grouped_df fsum.matrix
Fast Summarise: fsummarise fsummarize smr
Fast Transform and Compute Columns on a Data Frame: fcompute fcomputev fmutate ftransform ftransform<- ftransformv mtt settfm settfmv settransform settransformv tfm tfm<- tfmv
Fast Unique Elements / Rows: any_duplicated fduplicated fnunique funique funique.default funique.pdata.frame funique.pseries funique.sf
Fast (Grouped, Weighted) Variance and Standard Deviation for Matrix-Like Objects: fsd fsd.default fsd.grouped_df fsd.matrix fvar fvar.default fvar.grouped_df fvar.matrix
Find and Extract / Subset List Elements: atomic_elem atomic_elem<- get_elem has_elem irreg_elem list_elem list_elem<- reg_elem
Groningen Growth and Development Centre 10-Sector Database: GGDC10S
Fast Hash-Based Grouping: group
Generate Run-Length Type Group-Id: groupid
Fast Grouping / _collapse_ Grouping Objects: as_factor_GRP fgroup_by fgroup_vars fungroup gby greorder group_by_vars GRP GRP.default GRP.factor GRP.grouped_df GRP.GRP GRP.pdata.frame GRP.pseries GRP.qG GRPid GRPN GRPnames gsplit is_GRP length.GRP plot.GRP print.GRP
Fast Indexed Time Series and Panels: $.indexed_frame $<-.indexed_frame findex findex_by iby indexing is_irregular ix print.index_df reindex to_plm unindex [.indexed_frame [.indexed_series [.index_df [<-.indexed_frame [[.indexed_frame [[<-.indexed_frame
Unlistable Lists: is_unlistable
Fast Table Joins: join
Determine the Depth / Level of Nesting of a List: ldepth
List Processing: A8-list-processing list-processing
Pad Matrix-Like Objects with a Value: pad
Fast and Easy Data Reshaping: pivot
Auto- and Cross- Covariance and Correlation Function Estimation for Panel Series: psacf psacf.default psacf.pdata.frame psacf.pseries psccf psccf.default psccf.pseries pspacf pspacf.default pspacf.pdata.frame pspacf.pseries
Matrix / Array from Panel Series: aperm.psmat plot.psmat psmat psmat.default psmat.pdata.frame psmat.pseries [.psmat
(Pairwise, Weighted) Correlations, Covariances and Observation Counts: print.pwcor print.pwcov pwcor pwcov pwnobs
Fast Factor Generation, Interactions and Vector Grouping: as_factor_qG finteraction is_qG itn qF qG
Fast (Grouped, Weighted) Summary Statistics for Cross-Sectional and Panel: print.qsu qsu qsu.default qsu.grouped_df qsu.matrix qsu.pdata.frame qsu.pseries qsu.sf
Fast (Weighted) Cross Tabulation: qtab qtable
Quick Data Conversion: A4-quick-conversion as_character_factor as_integer_factor as_numeric_factor mctl mrtl qDF qDT qM qTBL quick-conversion
Fast Radix-Based Ordering: radixorder radixorderv
Recursively Apply a Function to a List of Data Objects: rapply2d
Recode and Replace Values in Matrix-Like Objects: AA1-recode-replace recode-replace recode_char recode_num replace_inf replace_na replace_outliers
Row-Bind Lists / Data Frame-Like Objects: rowbind
Fast Reordering of Data Frame Rows: roworder roworderv
Fast (Recursive) Splitting: rsplit rsplit.default rsplit.matrix
Generate Group-Id from Integer Sequences: seqid
Small (Helper) Functions: %=% .c AA3-small-helpers add_stub all_funs all_identical all_obj_equal copyAttrib copyMostAttrib is_categorical is_date massign namlab rm_stub setAttrib setattrib setColnames setDimnames setLabels setRownames small-helpers unattrib vclasses vlabels vlabels<-
Summary Statistics: A9-summary-statistics summary-statistics
Efficient List Transpose: t_list
Time Series and Panel Series: A7-time-series-panel-series time-series-panel-series
Generate Integer-Id From Time/Date Sequences: timeid
Transform Data by (Grouped) Replacing or Sweeping out Statistics: setTRA TRA TRA.default TRA.grouped_df TRA.matrix
Recursive Row-Binding / Unlisting in 2D - to Data Frame: unlist2d
Fast Check of Variation in Data: varying varying.default varying.grouped_df varying.matrix varying.pdata.frame varying.pseries varying.sf
World Development Dataset: wlddev