How do I run "run_squad.py" on Google Colab? It gives an "invalid syntax" error (using Google Colab)

25-02-24 12

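This article covers how to run "run_squad.py" on Google Colab when it fails with an "invalid syntax" error. It also collects related questions: a DataLoader worker that exits unexpectedly when run in Visual Studio but works fine on Google Colab, download file by python in google colab, the "IOPub data rate exceeded" message in Google Colab, and Google Colab running on a folder outside the "My Drive" folder.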

Contents:

How do I run "run_squad.py" on Google Colab? It gives an "invalid syntax" error (using Google Colab)
DataLoader worker exits unexpectedly when run in Visual Studio but works fine on Google Colab
download file by python in google colab
IOPub data rate exceeded in Google Colab
Google Colab runs on a folder outside the "My Drive" folder

How do I run "run_squad.py" on Google Colab? It gives an "invalid syntax" error (using Google Colab)

How can I fix the "invalid syntax" error when running "run_squad.py" on Google Colab?

I first downloaded the file with:

!curl -L -O https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py

Then I ran the following:

!python run_squad.py  \
    --model_type bert   \
    --model_name_or_path bert-base-uncased  \
    --output_dir models/bert/ \
    --data_dir data/squad   \
    --overwrite_output_dir \
    --overwrite_cache \
    --do_train  \
    --train_file /content/train.json   \
    --version_2_with_negative \
    --do_lower_case  \
    --do_eval   \
    --predict_file /content/val.json   \
    --per_gpu_train_batch_size 2   \
    --learning_rate 3e-5   \
    --num_train_epochs 2.0   \
    --max_seq_length 384   \
    --doc_stride 128   \
    --threads 10   \
    --save_steps 5000

I also tried the following:

!python run_squad.py \
  --model_type bert \
  --model_name_or_path bert-base-cased \
  --do_train \
  --do_eval \
  --do_lower_case \
  --train_file /content/train.json \
  --predict_file /content/val.json \
  --per_gpu_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 2.0 \
  --max_seq_length 584 \
  --doc_stride 128 \
  --output_dir /content/

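Both commands fail with the same error:

File "run_squad.py", line 7
    ^
SyntaxError: invalid syntax

What exactly is the problem? How do I run the .py file?

Solution

Solved: the error occurred because I had downloaded the GitHub page for the file (the "blob" link, which returns HTML) rather than the script itself. Once I copied the "Raw" link and downloaded the script from it, the code ran.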
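
The fix above can be sketched in Python. This is a minimal sketch, not part of the original answer: the helper names `to_raw_url` and `looks_like_html` are hypothetical, and the branch name (`master`) and file path are taken from the question as-is, so they may no longer exist in the repository.

```python
from urllib.parse import urlsplit


def to_raw_url(blob_url: str) -> str:
    """Convert a github.com "blob" page URL into its raw-content URL."""
    parts = urlsplit(blob_url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {blob_url}")
    # Path layout: /<owner>/<repo>/blob/<branch>/<path to file>
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a blob URL: {blob_url}")
    owner, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


def looks_like_html(text: str) -> bool:
    """Heuristic check: a downloaded 'script' that starts with HTML markup is a web page."""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


# URL from the question (hypothetical today: the branch or path may have moved).
blob = "https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py"
print(to_raw_url(blob))
```

In a Colab cell, the printed URL is what you would pass to `!curl -L -O`. If a downloaded file fails with a syntax error in its first few lines, reading it and calling `looks_like_html` on its contents is a quick way to confirm that a web page was saved instead of the script.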

DataLoader worker exits unexpectedly when run in Visual Studio but works fine on Google Colab

How can I fix a DataLoader worker that exits unexpectedly when run in Visual Studio but works fine on Google Colab?

I have a data loader that loads data from HDF5, but it exits unexpectedly when I use num_workers > 0 (it works fine with 0). Stranger still, it works with more workers on Google Colab, but not on my computer. On my PC I get the following error:

Traceback (most recent call last):
  File "C:\Users\Flavio Maia\AppData\Roaming\Python\python37\site-packages\torch\utils\data\DataLoader.py", line 986, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\python37_64\lib\multiprocessing\queues.py", line 105, in get
    raise Empty
_queue.Empty

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "", line 2, in
  File "C:\Users\Flavio Maia\AppData\Roaming\Python\python37\site-packages\torch\utils\data\DataLoader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Users\Flavio Maia\AppData\Roaming\Python\python37\site-packages\torch\utils\data\DataLoader.py", line 1182, in _next_data
    idx, data = self._get_data()
  File "C:\Users\Flavio Maia\AppData\Roaming\Python\python37\site-packages\torch\utils\data\DataLoader.py", line 1148, in _get_data
    success, data = self._try_get_data()
  File "C:\Users\Flavio Maia\AppData\Roaming\Python\python37\site-packages\torch\utils\data\DataLoader.py", line 999, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 12332) exited unexpectedly

Also, my __getitem__ function is:

def __getitem__(self, index):
    # Map the global sample index to a file and a position inside that file.
    desired_file = index // self.file_size
    position = index % self.file_size

    # Open the HDF5 file for this sample, read the three arrays, then close it.
    h5_file = h5py.File(self.files[desired_file], 'r')

    image = h5_file['Screenshots'][position]
    rect = h5_file['Rectangles'][position]
    numb = h5_file['Numbers'][position]

    h5_file.close()

    # Convert the NumPy data to float tensors.
    image = torch.from_numpy(image).float()
    rect = torch.from_numpy(rect).float()
    numb = torch.from_numpy(np.asarray(numb)).float()

    return (image, rect, numb)

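Does anyone know what is causing this empty queue?

Solution

Windows does not handle num_workers > 0 well unless the script's entry point is guarded: worker processes are started with "spawn", which re-imports the main script in each worker. You can set num_workers to 0, which works fine. What should also work: put all of your training/test code inside train()/test() functions and call them under if __name__ == "__main__":, for example like this: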

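import torch

class MyDataset(torch.utils.data.Dataset):
    # Define __len__ and __getitem__ here.
    ...

def train():
    # create_dataloader() is a placeholder for your own DataLoader setup.
    # Build it inside the function, not at module level.
    train_set = create_dataloader()
    ...

def test():
    test_set = create_dataloader()
    ...

if __name__ == "__main__":
    # On Windows each worker re-imports this file; the guard keeps the
    # workers from re-running the training code themselves.
    train()
    test()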

download file by python in google colab

https://stackoverflow.com/questions/15352668/download-and-decompress-gzipped-file-in-memory

You need to seek to the beginning of compressedFile after writing to it but before passing it to gzip.GzipFile(). Otherwise it will be read from the end by gzip module and will appear as an empty file to it. See below:

#!/usr/bin/env python3
import gzip
import io
import urllib.request

baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
filename = "man-pages-3.34.tar.gz"
outFilePath = "man-pages-3.34.tar"

# Download the archive into an in-memory bytes buffer.
# (Python 3: urllib2 -> urllib.request, StringIO.StringIO -> io.BytesIO.)
response = urllib.request.urlopen(baseURL + filename)
compressedFile = io.BytesIO()
compressedFile.write(response.read())
#
# Set the file's current position to the beginning
# of the file so that gzip.GzipFile can read
# its contents from the top.
#
compressedFile.seek(0)

decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')

# Write in binary mode, because the decompressed data is bytes.
with open(outFilePath, 'wb') as outfile:
    outfile.write(decompressedFile.read())

https://stackoverflow.com/questions/11914472/stringio-in-python3
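
The effect of the missing seek can be reproduced without any network access. The following is a small offline sketch under stated assumptions: the helper name `decompress_from_buffer` and the sample payload are made up for illustration, and the compressed data is produced locally with `gzip.compress` instead of being downloaded.

```python
import gzip
import io


def decompress_from_buffer(payload: bytes, rewind: bool = True) -> bytes:
    """Write gzip bytes into a BytesIO buffer, then decompress from that buffer."""
    buffer = io.BytesIO()
    buffer.write(payload)  # after writing, the position is at the end of the buffer
    if rewind:
        buffer.seek(0)     # move back to the start so gzip reads from the top
    with gzip.GzipFile(fileobj=buffer, mode="rb") as gz:
        return gz.read()


original = b"hello colab\n" * 3
payload = gzip.compress(original)

with_seek = decompress_from_buffer(payload)
without_seek = decompress_from_buffer(payload, rewind=False)

print(with_seek == original)     # the rewound buffer round-trips
print(without_seek == original)  # the un-rewound buffer does not
```

Passing the bytes straight to the constructor, as in `io.BytesIO(payload)`, leaves the position at the start, so no explicit `seek(0)` is needed in that case.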

IOPub data rate exceeded in Google Colab

How can I fix "IOPub data rate exceeded" in Google Colab?

I am trying to open a 32 MB .csv file in Google Colab, but I get this error:

df = pd.read_csv('2004-1_CA.csv', encoding='latin', error_bad_lines=False)

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)

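I have searched for the same problem, but everything I found is about downloading files, not reading them. The file is already uploaded to Colab, so the solutions I have found so far do not work for me. Thanks for your help.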

Solution

No working solution has been found for this problem yet.
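
No confirmed fix is recorded here, but the message itself concerns output sent to the browser, not the act of reading the file. One possible cause, offered only as an assumption, is that a cell prints one message per malformed row: older pandas versions report every skipped line when `error_bad_lines=False` is used. The pandas documentation for `read_csv` describes how to silence these reports (`warn_bad_lines=False` in older releases, `on_bad_lines="skip"` from pandas 1.3). The idea of summarizing instead of printing per row can be sketched with the standard library alone. The function name `read_rows_quietly` and the sample data are hypothetical.

```python
import csv
import io


def read_rows_quietly(text: str, expected_fields: int):
    """Parse CSV text, skipping rows with the wrong number of fields without printing.

    Returns (good_rows, skipped_row_numbers) so the caller can print a single
    summary line instead of one message per bad row.
    """
    good_rows, skipped = [], []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) == expected_fields:
            good_rows.append(row)
        else:
            skipped.append(row_number)
    return good_rows, skipped


# Made-up sample: the third row has one field too many.
sample = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"
rows, skipped = read_rows_quietly(sample, expected_fields=3)
print(f"kept {len(rows)} rows, skipped {len(skipped)} malformed rows")
```

The same principle applies to displaying results: showing `df.head()` rather than a whole large DataFrame keeps the amount of output small.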

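Google Colab runs on a folder outside the "My Drive" folder

How can I fix Google Colab running on a folder outside the "My Drive" folder?

I am taking an online course and I use Colab for the assignments. The initialization code is provided to me, so all I need to do is run it and start writing code.

The problem is that, for some reason, Colab decided to separate out one folder, "cs231n". I can still see this folder under "My Drive" in the tree on the left, but when I call, from the .ipynb notebook, functions that I wrote in .py files in the lowest folder, nothing happens. It is as if the .py files did not exist.

But when I write the code in the separate folder (the one above, not inside "My Drive"), the .ipynb works fine.

So I have to use the separated top-level cs231n folder to do the assignment, but that is a convoluted way of working...

Actual path in Drive: path

Colab initialization and folder tree: colab

Thanks :)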

Solution

No working solution has been found for this problem yet.
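
Without the screenshots the exact cause cannot be determined. One thing worth checking, as an assumption and not a confirmed diagnosis, is whether the folder that holds the .py files is on Python's import path and whether the notebook is still using a stale copy of the module. The sketch below uses only the standard library. The helper name `import_from_folder` and the module name `demo_helpers` are hypothetical, and a temporary directory stands in for the Drive folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_folder(folder: str, module_name: str):
    """Make `folder` importable, then import (or re-import) `module_name`."""
    folder_path = str(Path(folder).resolve())
    if folder_path not in sys.path:
        sys.path.append(folder_path)
    importlib.invalidate_caches()  # notice files created after the interpreter started
    if module_name in sys.modules:
        # Reload so that edits to the .py file take effect in the running notebook.
        return importlib.reload(sys.modules[module_name])
    return importlib.import_module(module_name)


# Stand-in for the Drive folder: a temporary directory with one small module.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_helpers.py").write_text(
        "def greet():\n    return 'hello from the folder'\n"
    )
    demo = import_from_folder(tmp, "demo_helpers")
    print(demo.greet())
```

In Colab, the folder argument would be the real location of the assignment files after Drive is mounted, for example a path under `/content/drive/` (the exact path is an assumption and depends on where the cs231n folder actually lives).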
