Janice
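2023-04-29 17:27
Hello, I have a question about part 3. The question mentions "preparation of the textual data". Why do you take that to mean cleansing by default? My understanding is that preparation covers both cleaning and wrangling, so why is the part covered by wrangling not the answer here?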
Category: CFA Level II > Quantitative Methods
1 answer
Vincent (teaching assistant)
2023-05-01 20:57
This answer was accepted by the asker.
Hello,
The book calls this topic DATA PREPARATION AND WRANGLING. Here "preparation" refers to cleansing and does not include wrangling.
The book also gives a specific explanation:
Data Preparation (Cleansing): This is the initial and most common task in data
preparation that is performed on raw data. Data cleansing is the process of examining,
identifying, and mitigating errors in raw data. Normally, the raw data are
neither sufficiently complete nor sufficiently clean to directly train the ML model.
Manually entered data can have incomplete, duplicated, erroneous, or inaccurate
values. Automated data (recorded by systems) can have similar problems due to server
failures and software bugs.
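
To make the cleansing definition above concrete, here is a minimal Python sketch. It is not from the book: the records, the field names, and the `cleanse` helper are all hypothetical, and simply dropping bad rows is only one simple way to "mitigate" errors (filling in or correcting values is another).

```python
# Hypothetical raw records; field names and values are made up for illustration.
raw_records = [
    {"id": 1, "name": "Alice", "age": 34},
    {"id": 2, "name": "Bob", "age": None},   # incomplete: missing age
    {"id": 1, "name": "Alice", "age": 34},   # duplicated row
    {"id": 3, "name": "Carol", "age": -5},   # erroneous: negative age
    {"id": 4, "name": "Dave", "age": 51},
]


def cleanse(records):
    """Drop duplicated, incomplete, and obviously erroneous records."""
    seen = set()
    cleaned = []
    for rec in records:
        # Build a hashable fingerprint of the record to detect exact duplicates.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        # Skip records with any missing value (incomplete data).
        if any(value is None for value in rec.values()):
            continue
        # Skip records that fail a basic sanity check (erroneous data).
        if rec["age"] < 0:
            continue
        cleaned.append(rec)
    return cleaned


cleaned = cleanse(raw_records)
print([rec["id"] for rec in cleaned])  # prints [1, 4]
```

Everything in this sketch only examines and removes errors in the raw data, which is what the book groups under preparation (cleansing). It does no transformation of the cleansed data.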
Data Wrangling (Preprocessing): This task performs transformations and critical
processing steps on the cleansed data to make the data ready for ML model training.
Raw data most commonly are not present in the appropriate format for model consumption.
After the cleansing step, data need to be processed by dealing with outliers,
extracting useful variables from existing data points,
