Recommentation-Materials

Posted on 2021-12-22 In DataWhale , Recommentation
Symbols count in article: 14k Reading time ≈ 13 mins.

1. Introduction

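This post is part of the hands-on news recommendation series (data layer, building the material pool) and covers the basics of the scrapy crawler framework. For an open-source recommender system, a continuous supply of data is essential. scrapy is an easy-to-use yet powerful crawler framework with a fixed file structure, classes, and methods; in practice we only need to implement the required class methods to complete a crawling task. The post provides working code for crawling news for the news recommender system, so that readers can quickly grasp the basics of scrapy and apply them to other cases.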

Recommentation-Database

Posted on 2021-12-18 Edited on 2021-12-22 In DataWhale , Recommentation
Symbols count in article: 48k Reading time ≈ 43 mins.

1. Introduction

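This post is part of the hands-on news recommendation series (data layer, building the material pool) and covers MySQL. In this project, MySQL stores structured data (user and news features). Algorithm engineers need to know common MySQL syntax (such as insert, delete, update, and query operations, and sorting), because in day-to-day work it is often used to compute statistics or extract features. With that in mind, this post summarizes common MySQL syntax and how to operate MySQL from Python, as a quick reference.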
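
The operations mentioned above (insert, delete, update, query, sorting) can be sketched as follows. This is a minimal illustration, not the post's code: it uses Python's built-in sqlite3 module as a stand-in for a MySQL server so that it runs without any setup, and the `news` table, its columns, and the sample rows are hypothetical. With MySQL the SQL statements would be nearly identical, but the connection would come from a MySQL driver and the parameter placeholder would typically be `%s` instead of `?`.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table holding a few news features.
cur.execute(
    "CREATE TABLE news ("
    "id INTEGER PRIMARY KEY, title TEXT, category TEXT, clicks INTEGER)"
)

# Insert: add several rows with a parameterized statement.
cur.executemany(
    "INSERT INTO news (title, category, clicks) VALUES (?, ?, ?)",
    [
        ("Oil prices rise", "finance", 120),
        ("New phone released", "tech", 300),
        ("Local team wins", "sports", 80),
    ],
)

# Update: increment the click count of one article.
cur.execute("UPDATE news SET clicks = clicks + 1 WHERE title = ?", ("Oil prices rise",))

# Delete: remove every article in one category.
cur.execute("DELETE FROM news WHERE category = ?", ("sports",))
conn.commit()

# Query with sorting: remaining articles, most clicked first.
cur.execute("SELECT title, clicks FROM news ORDER BY clicks DESC")
rows = cur.fetchall()
print(rows)

conn.close()
```

Parameterized statements are used throughout so that values are never concatenated into the SQL string.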

Crawler-JD

Posted on 2021-10-02 In Project , Crawler
Symbols count in article: 24k Reading time ≈ 22 mins.

1. Introduction

1.1 Job description

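This project is a crawler for JD.com review data. With the growth of e-commerce on sites such as JD.com and Taobao, online reviews act as electronic word of mouth and significantly influence product marketing strategies.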

Crawler-qcc

Posted on 2021-10-02 In Project , Crawler
Symbols count in article: 48k Reading time ≈ 43 mins.

1. Introduction

1.1 Job description

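This project crawls registered-company information from Qichacha (企查查). It originated from someone else's experimental requirements, so this post anonymizes the project's actual data: the 1168 links involved are not provided, nor is the finished dataset.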

test

Posted on 2021-09-01 Edited on 2023-03-28 In Notes , Markdown
Symbols count in article: 8.6k Reading time ≈ 8 mins.

1. Introduction

This is a test file!

Text-mining

Posted on 2021-09-01 In Thesis , Prediction
Symbols count in article: 31k Reading time ≈ 29 mins.

1. Introduction

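Predicting oil prices (returns) with indices such as investor sentiment (i.e., predictability), for example bullish and bearish investor sentiment;
Investor attention, for example attention to the oil market (which can be measured with Google Trends), oil policy, green consumption, carbon neutrality, and so on;
Oil price prediction based on data mining. The aim is to find novel ideas for predicting oil prices; the preliminary plan is to crawl news text and then use natural language processing to analyze investor sentiment (sentiment analysis) and investor attention.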

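Although the study is about oil price prediction, the oil price is really just a vehicle: the processing logic would be much the same for other commodities. A vehicle is needed only because the research topic is energy finance.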

Pytorch-4-LanguageModel

Posted on 2021-08-06
Symbols count in article: 0 Reading time ≈ 1 mins.

Pytorch-3-Word2vec

Posted on 2021-08-05 Edited on 2022-01-21 In Notes , Python module , Pytorch
Symbols count in article: 32k Reading time ≈ 29 mins.

1. Study goals

Learn the concept of word vectors
Train word vectors with the skip-gram model
Learn to use PyTorch Dataset and DataLoader
Learn to define a PyTorch model
Learn common Modules in torch.nn
- Embedding
Learn common PyTorch operations
- bmm
- logsigmoid
Save and load PyTorch models
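
As a rough illustration of how the operations listed above fit together, the sketch below computes the negative-sampling loss commonly used to train word2vec-style word vectors, for a single (center, context) pair. It is written in plain Python instead of PyTorch so that it runs without dependencies; in PyTorch the dot products would typically be batched with `torch.bmm` and the log-sigmoid taken with `torch.nn.functional.logsigmoid`. The function names and the toy vectors are hypothetical and not taken from the post.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # log(sigmoid(x)) = -log(1 + exp(-x)); the two branches avoid overflow.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def neg_sampling_loss(center, context, negatives):
    """Loss for one (center, context) pair plus sampled negative words."""
    # The true context word should score high against the center word...
    positive_term = log_sigmoid(dot(center, context))
    # ...while randomly sampled negative words should score low.
    negative_term = sum(log_sigmoid(-dot(center, neg)) for neg in negatives)
    return -(positive_term + negative_term)


# Toy 2-dimensional embeddings (hypothetical values).
center = [0.5, -0.2]
context = [0.4, 0.1]
negatives = [[-0.3, 0.2], [0.1, -0.5]]

loss = neg_sampling_loss(center, context, negatives)
print(loss)
```

Minimizing this loss pulls the center vector toward its observed context word and pushes it away from the sampled negatives.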

Pytorch-2-Autogradient

Posted on 2021-08-04 Edited on 2021-08-05 In Notes , Python module , Pytorch
Symbols count in article: 15k Reading time ≈ 13 mins.

1. What is PyTorch?

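PyTorch is a Python-based scientific computing library with the following features:

It is similar to NumPy, but it can use GPUs
It can be used to define deep learning models, and to train and use them flexibly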

Pytorch-1-Introduction

Posted on 2021-08-03 Edited on 2021-08-05 In Notes , Python module , Pytorch
Symbols count in article: 3.4k Reading time ≈ 3 mins.

Statement: This series of posts records my personal notes and experiences from the BiliBili video tutorial “Pytorch 入门学习”. Most of the code and pictures come from the courseware PyTorch-Course. All posted content is for personal study only; do not use it for other purposes. If there is any infringement, please contact yangsuoly@qq.com to request removal.