python_spider/spider_work @ 4d6ce1b619e56470e17a44665ec1f51a4ad43faf

lei.chen 4d6ce1b619 update ins cookies 6.13.3		1 year ago
README.md	fae6264064 update 6.13.1	1 year ago
YamlLoader.py	fae6264064 update 6.13.1	1 year ago
application.yml	fae6264064 update 6.13.1	1 year ago
ins_history_spider.py	4d6ce1b619 update ins cookies 6.13.3	1 year ago
ins_posts_spider.py	4d6ce1b619 update ins cookies 6.13.3	1 year ago
mysql_pool.py	fae6264064 update 6.13.1	1 year ago
requirements.txt	fae6264064 update 6.13.1	1 year ago

README.md

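1. [Main program] Call get_userPosts('fanatics') to create the generator
2. [Generator setup] On entering get_userPosts:
   ├─ 2.1 Initialize the pagination parameters: continuations = [{'count':'12'}]
   ├─ 2.2 Build the initial URL: https://i.instagram.com/api/v1/feed/user/fanatics/username/
   └─ → Nothing runs yet: Python defers the generator body, including 2.1 and 2.2, until the first iteration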
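
Steps 1 and 2 depend on Python's lazy generator semantics. The sketch below illustrates this with a hypothetical stand-in, `get_user_posts_demo` (not the project's real function, and with no network access): calling it only creates a generator object, and the setup code runs once iteration begins. The `log` list is an assumption added purely to make the ordering observable.

```python
def get_user_posts_demo(user_name, log):
    # Hypothetical stand-in for get_userPosts; it never touches the network.
    log.append("init")  # corresponds to steps 2.1 and 2.2
    continuations = [{"count": "12"}]
    url = f"https://i.instagram.com/api/v1/feed/user/{user_name}/username/"
    while continuations:
        params = continuations.pop()
        log.append("request")  # where the real code would send the API request
        yield {"url": url, "params": params}


log = []
items = get_user_posts_demo("fanatics", log)  # step 1: only creates the generator
created_log = list(log)                       # snapshot taken before any iteration
first = next(items)                           # first iteration runs the setup code
```

Here `created_log` is still empty after the call, while `log` holds both entries once `next(items)` has returned.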

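3. [First iteration] The main program executes for item_ in items:
   ├─ 3.1 get_userPosts runs:
   │   ├─ Pop the pagination parameters {'count':'12'}
   │   ├─ Send the API request for the first page of data
   │   └─ → On failure: yield an error message → stop
   │
   ├─ 3.2 Handle the response:
   │   ├─ Check whether resp.user exists
   │   ├─ → Missing: yield an error → stop
   │   └─ → Present:
   │       ├─ Extract _items = resp.items
   │       ├─ Check more_available:
   │       │   ├─ True: append {'count':'12', 'max_id':next_max_id} to continuations
   │       │   └─ Update temp to the user ID
   │       └─ Execute yield from extract_post(_items)
   │
   └─ 3.3 extract_post starts processing:
       ├─ For each post:
       │   ├─ Create the base item dict
       │   ├─ Add media URLs according to media_type:
       │   │   ├─ Type 1 (single image): item.photo = image URL
       │   │   ├─ Type 2 (video): item.video = video URL
       │   │   └─ Type 8 (carousel): item.photo = [multiple images], item.video = first video
       │   └─ Yield the item → the main program prints it
       └─ After all posts are processed → return to get_userPosts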
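
The media_type branching in step 3.3 could look roughly like the sketch below. It is not the project's actual extract_post: the response field names (`code`, `image_versions2`, `video_versions`, `carousel_media`) and the helpers `_best_image` and `_first_video` are assumptions made for illustration, and the sample posts are invented.

```python
def _best_image(media):
    # Assumed response shape: image_versions2.candidates[0].url
    candidates = media.get("image_versions2", {}).get("candidates", [])
    return candidates[0]["url"] if candidates else None


def _first_video(media):
    # Assumed response shape: video_versions[0].url
    versions = media.get("video_versions", [])
    return versions[0]["url"] if versions else None


def extract_post(posts):
    """Yield one flat item per post, adding media URLs by media_type."""
    for post in posts:
        media_type = post.get("media_type")
        item = {"code": post.get("code"), "media_type": media_type}
        if media_type == 1:      # single image
            item["photo"] = _best_image(post)
        elif media_type == 2:    # video
            item["video"] = _first_video(post)
        elif media_type == 8:    # carousel: all images, first video only
            children = post.get("carousel_media", [])
            item["photo"] = [u for u in map(_best_image, children) if u]
            videos = [u for u in map(_first_video, children) if u]
            item["video"] = videos[0] if videos else None
        yield item


# Invented sample posts covering the three media types.
sample = [
    {"code": "a", "media_type": 1,
     "image_versions2": {"candidates": [{"url": "img-a"}]}},
    {"code": "b", "media_type": 2, "video_versions": [{"url": "vid-b"}]},
    {"code": "c", "media_type": 8, "carousel_media": [
        {"image_versions2": {"candidates": [{"url": "img-c1"}]}},
        {"video_versions": [{"url": "vid-c2"}]},
    ]},
]
items = list(extract_post(sample))
```

Because extract_post is itself a generator, the caller receives each item as soon as it is built, which is what lets `yield from extract_post(_items)` in step 3.2 pass items straight through to the main loop.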

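4. [Subsequent iterations] The main program keeps looping:
   ├─ 4.1 Check continuations:
   │   ├─ Empty → iteration ends
   │   └─ Non-empty → repeat steps 3.1-3.3 to fetch the next page
   │
   └─ 4.2 Automatic pagination example:
       ├─ First page: max_id not set → fetch the first 12 posts
       ├─ Second page: max_id=xxx → fetch the next 12 posts
       └─ Continue until more_available=False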
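
The continuation-stack pagination in steps 3 and 4 can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's code: `paginate`, `fake_fetch`, `PAGES`, and the shape of the error item are hypothetical, and the fake backend replaces the real HTTP request so the example runs offline.

```python
def paginate(fetch_page, page_size="12"):
    """Yield posts page by page using a stack of pending query parameters."""
    continuations = [{"count": page_size}]  # first page carries no max_id
    while continuations:                     # an empty stack ends iteration
        params = continuations.pop()
        resp = fetch_page(params)
        if not resp or "user" not in resp:
            # Hypothetical error item; the real error format is not documented here.
            yield {"error": "request failed or user missing"}
            return
        if resp.get("more_available"):
            # Queue the next page before yielding the current one.
            continuations.append(
                {"count": page_size, "max_id": resp["next_max_id"]}
            )
        yield from resp.get("items", [])


# Fake two-page backend keyed by max_id; stands in for the real API call.
PAGES = {
    None: {"user": {"pk": 1}, "items": ["p1", "p2"],
           "more_available": True, "next_max_id": "m1"},
    "m1": {"user": {"pk": 1}, "items": ["p3"], "more_available": False},
}
requests_seen = []


def fake_fetch(params):
    requests_seen.append(dict(params))  # record what each request asked for
    return PAGES.get(params.get("max_id"))


posts = list(paginate(fake_fetch))
```

The second request is the only one that includes `max_id`, and the loop stops by itself once a page reports `more_available` as false and nothing is left on the stack.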