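# Playwright Installation Guide

## Quick Install

### Step 1: Install the Playwright Python package

```bash
pip install playwright
```

Or, if you use a conda environment:

```bash
conda install -c conda-forge playwright
```

### Step 2: Install the browser

Install Chromium only (recommended, keeps the download small):

```bash
playwright install chromium
```

Or install all default browsers (Chromium, Firefox, WebKit):

```bash
playwright install
```

### Step 3: Verify the installation

Run the following command to confirm that the Python package can be imported:

```bash
python -c "from playwright.sync_api import sync_playwright; print('✅ Playwright installed successfully')"
```

---

## Details

### Choosing an installation method

#### Option 1: pip (recommended)

```bash
# Install Playwright
pip install playwright

# Install the browser (Chromium only)
playwright install chromium
```

#### Option 2: uv (if the project uses uv)

```bash
# Install Playwright
uv pip install playwright

# Install the browser
playwright install chromium
```

#### Option 3: conda

```bash
# Install Playwright
conda install -c conda-forge playwright

# Install the browser (this still uses the playwright command)
playwright install chromium
```

The official Playwright documentation installs the conda package from the `microsoft` channel with `conda-forge` also enabled. If the command above cannot find the package, try that channel.

---

## Choosing a browser

The sizes below are approximate and vary by Playwright version and platform.

### Chromium (recommended, about 170MB)

```bash
playwright install chromium
```

- ✅ Starts quickly
- ✅ Best compatibility with most websites
- ✅ Well suited to headless mode

### Chrome (about 200MB)

```bash
playwright install chrome
```

- ✅ Uses the real Google Chrome browser (launch it with `channel="chrome"`)
- ❌ Larger download

### Firefox (about 100MB)

```bash
playwright install firefox
```

- ✅ Smaller download
- ❌ Slightly slower to start

### WebKit (about 100MB)

```bash
playwright install webkit
```

- ✅ Smaller download
- ❌ Compatibility may be worse than Chromium

### All browsers

```bash
playwright install
```

- ❌ Large download (about 500MB or more)
- ✅ Lets you test against different browsers

---

## Installing on a server without a GUI

On a Linux server without a graphical environment, the browser's system dependencies must be installed first. On supported Debian and Ubuntu releases, `playwright install-deps chromium` can install them for you; the package lists below are the manual alternative.

### Ubuntu/Debian

```bash
# Install system dependencies
sudo apt-get update
sudo apt-get install -y \
    libnss3 \
    libnspr4 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libdrm2 \
    libdbus-1-3 \
    libxkbcommon0 \
    libxcomposite1 \
    libxdamage1 \
    libxfixes3 \
    libxrandr2 \
    libgbm1 \
    libasound2

# Install Playwright
pip install playwright
playwright install chromium
```

### CentOS/RHEL

```bash
# Install system dependencies
sudo yum install -y \
    nss \
    nspr \
    atk \
    at-spi2-atk \
    cups-libs \
    libdrm \
    dbus-glib \
    libxkbcommon \
    libXcomposite \
    libXdamage \
    libXfixes \
    libXrandr \
    mesa-libgbm \
    alsa-lib

# Install Playwright
pip install playwright
playwright install chromium
```

### Alibaba Cloud Linux

Alibaba Cloud Linux 3 (OpenAnolis Edition) is CentOS/RHEL-compatible, so the same commands apply:

```bash
# Install system dependencies (use yum or dnf, depending on the system version)
sudo yum install -y \
    nss \
    nspr \
    atk \
    at-spi2-atk \
    cups-libs \
    libdrm \
    dbus-glib \
    libxkbcommon \
    libXcomposite \
    libXdamage \
    libXfixes \
    libXrandr \
    mesa-libgbm \
    alsa-lib

# If yum is unavailable, try dnf
# sudo dnf install -y nss nspr atk at-spi2-atk cups-libs libdrm dbus-glib libxkbcommon libXcomposite libXdamage libXfixes libXrandr mesa-libgbm alsa-lib

# Install Playwright
pip install playwright
playwright install chromium
```

**Note**: During installation you may see a "BEWARE: your OS is not officially supported" warning. This is expected. Playwright falls back to a compatible build, which normally works without problems.

---

## Verifying the installation

### Method 1: Python import test

This only confirms that the Python package is installed; it does not launch a browser.

```bash
python -c "from playwright.sync_api import sync_playwright; print('✅ Playwright installed successfully')"
```

### Method 2: Run a simple test

Create a test file named `test_playwright.py`:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch headless Chromium and load a page to confirm the browser works
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto('https://xueqiu.com/')
    print(f'✅ Page title: {page.title()}')
    browser.close()
    print('✅ Playwright is working')
```

Run it:

```bash
python test_playwright.py
```

### Method 3: Test cookie retrieval

Run this from the project root so that the `src` package can be imported:

```bash
python -c "
from src.investment_team.utils.etf_data import _get_xueqiu_cookies
cookies = _get_xueqiu_cookies()
if cookies:
    print(f'✅ Retrieved {len(cookies)} cookies')
    print(f'Cookie names: {list(cookies.keys())[:5]}')
else:
    print('⚠️ No cookies retrieved (possibly a network problem)')
"
```

---

## FAQ

### Q1: The `playwright install` command is not found

**Problem**: After installing playwright, running `playwright install` reports that the command does not exist.

**Solution**:

1. Check that you are in the correct Python environment:

   ```bash
   which python
   which pip
   ```

2. Make sure playwright is installed in the current environment:

   ```bash
   pip show playwright
   ```

3. If you use a virtual environment, make sure it is activated:

   ```bash
   source venv/bin/activate  # Linux/macOS
   # or
   venv\Scripts\activate     # Windows
   ```

4. Run the installer as a Python module:

   ```bash
   python -m playwright install chromium
   ```

### Q2: Browser downloads are very slow

**Solution**:

1. Use a mirror in mainland China (if available):

   ```bash
   export PLAYWRIGHT_DOWNLOAD_HOST=https://npmmirror.com/mirrors/playwright
   playwright install chromium
   ```

2. Or download and install the browser manually (see the official Playwright documentation)

### Q2.5: "BEWARE: your OS is not officially supported" warning during installation

**Problem**: During installation you see a warning like this:

```
BEWARE: your OS is not officially supported by Playwright; downloading fallback build for ubuntu24.04-x64.
```

**Explanation**: This is **expected behavior**, not an error.

- Playwright may not yet officially support some newer Linux distributions (for example Ubuntu 24.04 or Alibaba Cloud Linux 3)
- Playwright automatically downloads a compatible fallback build
- The fallback build normally works without problems

**Check that the installation succeeded**:

```bash
python -c "from playwright.sync_api import sync_playwright; print('✅ Playwright installed successfully')"
```

If you see the success message, the Python package is installed and the warning can be ignored. To confirm that the fallback browser actually launches, run the test script from Method 2 above.

### Q3: Playwright fails on a server (no GUI)

**Solution**:

1. Make sure the system dependencies are installed (see "Installing on a server without a GUI" above)
2. Use headless mode (the project code already uses it by default):

   ```python
   browser = p.chromium.launch(headless=True)
   ```

### Q4: Out of memory

**Problem**: On a server with little memory, Playwright may fail to start the browser.

**Solution**:

1. Try a different browser (Firefox or WebKit), which may use less memory
2. Add swap space
3. Consider configuring cookies manually instead (Option 1)

### Q5: Permission problems

**Problem**: Installing the browser fails with a permission error.

**Solution**:

```bash
# Install as a regular user (sudo is not needed)
playwright install chromium

# Or choose the installation path explicitly
export PLAYWRIGHT_BROWSERS_PATH=~/.local/share/ms-playwright
playwright install chromium
```

If you set `PLAYWRIGHT_BROWSERS_PATH` at install time, set the same value when running your scripts so that Playwright can find the browser.

---

## Uninstalling Playwright

To uninstall:

```bash
# Uninstall the Python package
pip uninstall playwright

# Delete the downloaded browsers (optional, frees disk space)
# On Linux they are stored in ~/.cache/ms-playwright by default
rm -rf ~/.cache/ms-playwright
```

---

## Performance tuning

### 1. Install only the browsers you need

Install Chromium only (a single-browser install):

```bash
playwright install chromium
```

### 2. Use a persistent context

If you run Playwright frequently, a persistent browser context keeps cookies, localStorage, and other profile data on disk between runs, so the state does not have to be rebuilt each time. The browser itself is still launched on every run.

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Create a persistent context (cookies, localStorage, etc. are saved in user_data_dir)
    context = p.chromium.launch_persistent_context(
        user_data_dir='./browser_data',
        headless=True
    )
    # ... use context
    context.close()
```

### 3. Reuse the browser instance

The project code currently creates a new browser instance on every call. If you need to fetch cookies frequently, consider one of the following:

- Configure cookies manually (Option 1), which performs best
- Cache the browser instance (requires code changes)

---

## Related resources

- [Playwright official documentation](https://playwright.dev/python/)
- [Playwright Python API](https://playwright.dev/python/docs/api/class-playwright)
- [Installation troubleshooting](https://playwright.dev/python/docs/troubleshooting)

---

## Quick command reference

```bash
# Install
pip install playwright
playwright install chromium

# Verify
python -c "from playwright.sync_api import sync_playwright; print('OK')"

# Show the available install options
playwright install --help

# List downloaded browsers (default Linux location)
ls ~/.cache/ms-playwright

# Force a reinstall of Chromium
playwright install --force chromium
```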
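
The import check and the browser cache location mentioned in this guide can be combined into one small diagnostic script. The following is a minimal sketch, not part of Playwright itself. The function names (`default_browsers_dir`, `installed_browsers`, `check_installation`) are hypothetical. The default cache locations are assumptions: `~/.cache/ms-playwright` on Linux as stated above, and the commonly documented macOS and Windows equivalents. It only inspects the file system and does not launch a browser, so it complements the test script rather than replacing it.

```python
from __future__ import annotations

import importlib.util
import os
import sys
from pathlib import Path


def default_browsers_dir() -> Path:
    """Return the directory where downloaded browsers are expected to live.

    Assumption: PLAYWRIGHT_BROWSERS_PATH, when set to a path, takes priority.
    The special value "0" and XDG_CACHE_HOME are not handled in this sketch.
    """
    override = os.environ.get("PLAYWRIGHT_BROWSERS_PATH")
    if override and override != "0":
        return Path(override).expanduser()
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Caches" / "ms-playwright"
    if sys.platform == "win32":
        local = os.environ.get("LOCALAPPDATA", str(Path.home() / "AppData" / "Local"))
        return Path(local) / "ms-playwright"
    return Path.home() / ".cache" / "ms-playwright"


def installed_browsers(browsers_dir: Path) -> list[str]:
    """List the sub-directories (one per downloaded browser build), sorted by name."""
    if not browsers_dir.is_dir():
        return []
    return sorted(p.name for p in browsers_dir.iterdir() if p.is_dir())


def check_installation() -> dict:
    """Collect a small report without importing playwright or starting a browser."""
    browsers_dir = default_browsers_dir()
    return {
        # find_spec returns None when the package is not installed
        "package_found": importlib.util.find_spec("playwright") is not None,
        "browsers_dir": str(browsers_dir),
        "browsers": installed_browsers(browsers_dir),
    }


if __name__ == "__main__":
    report = check_installation()
    print(f"playwright package found: {report['package_found']}")
    print(f"browser directory: {report['browsers_dir']}")
    print(f"downloaded browsers: {report['browsers'] or 'none'}")
```

If the package is found but no browsers are listed, running `playwright install chromium` in the same environment is the likely next step.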