Day3 DS review - import datasets, programming with dplyr
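OK.
Time to import some data.
This one is utils. It ships with R, so there's nothing to install.
The . family (read.table, read.csv, read.delim and so on)
Mainly look at the difference between read.csv and read.csv2. It comes down to the defaults: read.csv expects comma-separated fields with . as the decimal mark, while read.csv2 expects semicolon-separated fields with , as the decimal mark.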
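To make the read.csv / read.csv2 difference concrete, here is a minimal sketch using base R only. The two inline strings are made-up sample data, not anything from the course.

```r
# Same numbers, two conventions. utils is attached by default in R.
us_style <- "id,price\n1,2.5\n2,3.75\n"   # comma separator, dot decimal
eu_style <- "id;price\n1;2,5\n2;3,75\n"   # semicolon separator, comma decimal

us <- read.csv(text = us_style)    # defaults: sep = ",", dec = "."
eu <- read.csv2(text = eu_style)   # defaults: sep = ";", dec = ","

# Both calls should give the same data frame with a numeric price column.
str(us)
str(eu)
```

If you read the semicolon file with read.csv instead, you would most likely end up with a single text column, which is usually the first sign that the wrong variant was used.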
readr, my forever favorite?
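A minimal readr sketch, assuming the readr package is installed. The file is a made-up temp file so the example stands on its own.

```r
library(readr)

# Write a tiny CSV to a temp file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "ann,90", "bob,85"), path)

# col_types = cols() lets readr guess the types without printing the spec.
scores <- read_csv(path, col_types = cols())
scores
```

The result is a tibble rather than a plain data.frame, which is one of the practical differences from read.csv.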
The legendary fread
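A minimal fread sketch, assuming the data.table package is installed. The semicolon-separated temp file is made up for illustration, to show that fread works out the separator by itself.

```r
library(data.table)

# A small semicolon-separated file written to a temp location.
path <- tempfile(fileext = ".csv")
writeLines(c("name;score", "ann;90", "bob;85"), path)

# fread() guesses the separator (here ";") and the column types on its own.
dt <- fread(path)
class(dt)
dt
```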
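The functions with a "hyphen" in the name (really an underscore, like read_excel) all seem pretty good, but you need to load readxl first.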
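A minimal readxl sketch, assuming the readxl package is installed. It uses the demo workbook that ships with readxl (via readxl_example()), not the file from the course.

```r
library(readxl)

# readxl_example() returns the path to a demo workbook bundled with readxl.
path <- readxl_example("datasets.xlsx")

sheets <- excel_sheets(path)                      # list the sheet names
mtcars_xl <- read_excel(path, sheet = "mtcars")   # read one sheet as a tibble

sheets
head(mtcars_xl, 3)
```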
The rubbish gdata
I hear gdata is pretty bad.
XLConnect for working with Excel. When I tidy up that RG list into a usable list in R, I might end up using XLConnect.
The way breaks works in ggplot is pretty unusual, so I'm noting it down here.
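The note doesn't say which breaks was meant. One place it shows up in data cleaning is the breaks argument of geom_histogram, which sets the bin edges by hand. Here is a sketch under that assumption, with ggplot2 installed and made-up ratings that should lie between 0 and 5.

```r
library(ggplot2)

# Hypothetical ratings that are supposed to fall between 0 and 5.
ratings <- data.frame(avg_rating = c(-1, 0.5, 2.2, 3.8, 4.9, 5.5, 6))

# Bin edges: everything below 0, the valid 0-5 range, everything above 5.
rating_breaks <- c(min(ratings$avg_rating), 0, 5, max(ratings$avg_rating))

p <- ggplot(ratings, aes(x = avg_rating)) +
  geom_histogram(breaks = rating_breaks)
p
```

With edges chosen like this, any bar outside the middle bin points straight at out-of-range values.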
replace, and missing values. I remember one course covered missing values really well, but I forget which one. I think it had something to do with sentiment analysis, maybe the tidyverse toolbox.
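A minimal sketch of replace() inside mutate(), assuming dplyr is installed. The movies table and the 0 to 5 range are made up, and filling with the mean is only one possible choice, not necessarily what the course did.

```r
library(dplyr)

# Hypothetical data: -1 and 7 are out of range for a 0-5 rating.
movies <- tibble(title  = c("a", "b", "c", "d"),
                 rating = c(4.2, -1, 7, 3.0))

cleaned <- movies %>%
  mutate(
    # Step 1: turn impossible values into NA.
    rating = replace(rating, rating < 0 | rating > 5, NA),
    # Step 2: fill every NA with the mean of the remaining values.
    rating = replace(rating, is.na(rating), mean(rating, na.rm = TRUE))
  )
cleaned
```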
no output=passed
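The screenshot is gone, so I can't tell which assert function was used. As an assumption, base R's stopifnot() is used below as a stand-in, since it behaves the same way: silence means the check passed.

```r
# Base R stand-in for an assert-style check (stopifnot() is an assumption,
# the original note does not name the function).
ratings <- c(4.2, 3.6, 3.0)

# Prints nothing when every condition holds ...
stopifnot(is.numeric(ratings), all(ratings >= 0 & ratings <= 5))

# ... and raises an error as soon as one condition fails.
result <- tryCatch(
  {
    stopifnot(all(ratings <= 4))
    "passed"
  },
  error = function(e) "failed"
)
result
```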
But you have to use filter for this.
Dropping full duplicates is so easy, hahaha, it's just distinct. data.table seems to have a more involved way of doing it, though.
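A minimal sketch of dropping full duplicates, assuming dplyr is installed. The visits table is made up.

```r
library(dplyr)

# Hypothetical data: rows 1 and 3 are identical in every column.
visits <- tibble(id   = c(1, 2, 1, 3),
                 city = c("Oslo", "Rome", "Oslo", "Lima"))

deduped <- distinct(visits)                  # dplyr: keep one copy of each row
deduped_base <- visits[!duplicated(visits), ]  # base R way to do the same

deduped
```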
Finding duplicates has an all-purpose trick too.
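I can't see which trick the screenshot showed. Two common ways to find duplicates are sketched here, assuming dplyr is installed and using a made-up visits table.

```r
library(dplyr)

# Hypothetical data: the (1, "Oslo") row appears twice.
visits <- tibble(id   = c(1, 2, 1, 3),
                 city = c("Oslo", "Rome", "Oslo", "Lima"))

# Count each combination and keep the ones that show up more than once.
dupes <- visits %>%
  count(id, city) %>%
  filter(n > 1)
dupes

# Base R: number of rows that repeat an earlier row.
n_repeats <- sum(duplicated(visits))
n_repeats
```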
The all-purpose trick, heh.
Dropping partial duplicates actually has an all-purpose trick as well.
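A sketch of two usual ways to handle partial duplicates (rows that agree on the identifying columns but differ elsewhere), assuming dplyr is installed. The scores table is made up, and which option is right depends on the data.

```r
library(dplyr)

# Hypothetical data: id 1 appears twice with different scores.
scores <- tibble(id    = c(1, 1, 2),
                 name  = c("ann", "ann", "bob"),
                 score = c(90, 95, 85))

# Option 1: keep only the first row for each id/name pair.
first_only <- scores %>%
  distinct(id, name, .keep_all = TRUE)

# Option 2: collapse the duplicates with a summary statistic.
averaged <- scores %>%
  group_by(id, name) %>%
  summarise(score = mean(score), .groups = "drop")

first_only
averaged
```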
semi-join gets rid of the monsters: it kicks them out (only the rows with a match in the reference table stay).
anti-join finds the monsters: it pulls them out on their own (only the rows without a match are returned).
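A minimal sketch of both joins used for membership checks, assuming dplyr is installed. The people and valid tables are made up.

```r
library(dplyr)

# Hypothetical data: "XX" is not a valid blood type.
people <- tibble(name  = c("ann", "bob", "cat"),
                 blood = c("A", "XX", "O"))
valid  <- tibble(blood = c("A", "B", "AB", "O"))

clean <- semi_join(people, valid, by = "blood")  # rows WITH a match in valid
odd   <- anti_join(people, valid, by = "blood")  # rows WITHOUT a match

clean
odd
```

Neither join adds columns from the second table. They only decide which rows of the first table are returned.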
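This one is about factors. The data cleaning course explains it very clearly. The material is all over the place, though, and pretty much every chapter could be a book of its own.
I feel this will be really handy when doing the final project: checking whether the levels contain any weird categories that don't make sense.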
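A minimal sketch of checking factor levels for odd categories, assuming dplyr is installed. The pets data and the lower-case/trim fix are made up for illustration.

```r
library(dplyr)

# Hypothetical data: "Dog" and " cat" are the same categories, badly typed.
pets <- tibble(kind = factor(c("dog", "Dog", "cat", " cat", "dog")))

levels(pets$kind)       # eyeball the levels for anything strange
pets %>% count(kind)    # and see how many rows each level has

# One possible fix: lower-case and trim whitespace, then rebuild the factor.
fixed <- pets %>%
  mutate(kind = factor(tolower(trimws(as.character(kind)))))
levels(fixed$kind)
```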
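stringr probably deserves a post of its own...
The whole str_ family is awesome. It can even detect, hahaha.
This use of filter is neat too: you don't write == TRUE, it just takes the logical result as the condition.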
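A minimal sketch of str_detect() inside filter(), assuming dplyr and stringr are installed. The phone numbers are made up.

```r
library(dplyr)
library(stringr)

# Hypothetical data: some phone numbers contain a dash, some do not.
phones <- tibble(phone = c("555-1234", "5559876", "555-0000"))

# str_detect() already returns TRUE/FALSE per row, so no "== TRUE" is needed.
with_dash <- phones %>%
  filter(str_detect(phone, "-"))
with_dash
```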
Without the _all suffix, only the first match it sees gets changed (for example str_replace versus str_replace_all).
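A quick sketch of the difference, assuming stringr is installed and that the screenshot was about the str_replace family. The string is made up.

```r
library(stringr)

phone <- "555-12-34"

first_only <- str_replace(phone, "-", " ")      # "555 12-34"
every_one  <- str_replace_all(phone, "-", " ")  # "555 12 34"

first_only
every_one
```

The same pattern holds for str_remove and str_remove_all.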
This one is a slightly silly point. I just want to remind myself.
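Course: Programming with dplyr. I think this course teaches some of the more advanced everyday knowledge of tidyverse dplyr.
Really, things like across, and those arguments that start with a dot, take some getting used to. Don't be scared.
.keep = "used" keeps only the columns that were used to build the new ones (plus the new columns themselves).
The default for .keep is "all", which keeps everything.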
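A minimal sketch of the .keep argument of mutate(), assuming dplyr 1.0 or later is installed. The df table is made up.

```r
library(dplyr)

df <- tibble(a = 1:3, b = 4:6, c = c("x", "y", "z"))

all_cols  <- df %>% mutate(total = a + b)                  # .keep = "all" (default)
used_only <- df %>% mutate(total = a + b, .keep = "used")  # drops the unused c

names(all_cols)   # "a" "b" "c" "total"
names(used_only)  # "a" "b" "total"
```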
across really just applies the same operation to multiple columns at once.
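A minimal across() sketch, assuming dplyr 1.0 or later is installed. The table and the chosen functions are made up.

```r
library(dplyr)

df <- tibble(g = c("x", "x", "y"),
             a = c(1, 2, 3),
             b = c(10, 20, 30))

# Apply the same transformation to columns a and b in one go.
doubled <- df %>%
  mutate(across(c(a, b), ~ .x * 2))

# Summarise every numeric column per group.
means <- df %>%
  group_by(g) %>%
  summarise(across(where(is.numeric), mean))

doubled
means
```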
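I thought sub was from stringr, but it is actually base R (the stringr counterpart is str_replace). String manipulation as a whole ties into NLP and sentiment analysis. Sentiment analysis basically comes down to counting positive words and negative words.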
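A quick base R sketch of sub() next to gsub(), using a made-up date string. sub() changes only the first match, gsub() changes all of them, which mirrors str_replace and str_replace_all in stringr.

```r
x <- "2024-01-15"

first_only <- sub("-", "/", x)    # "2024/01-15"
every_one  <- gsub("-", "/", x)   # "2024/01/15"

first_only
every_one
```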