2023-06-06 14:02:21,310 - INFO - running WikiExtractor.py: parse the chinese corpus
D:\软件\python\lib\site-packages\gensim\utils.py:1333: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected %s; aliasing chunkize to chunkize_serial" % entity)
Traceback (most recent call last):
  File "D:\pythonFiles\图灵\Python_project\self_learn\大语言模型\WikiExtractor.py", line 52, in <module>
    parse_corpus(infile, outfile)
  File "D:\pythonFiles\图灵\Python_project\self_learn\大语言模型\WikiExtractor.py", line 29, in parse_corpus
    for text in wiki.get_texts():
  File "D:\软件\python\lib\site-packages\gensim\corpora\wikicorpus.py", line 693, in get_texts
    for tokens, title, pageid in pool.imap(_process_article, group):
  File "D:\软件\python\lib\multiprocessing\pool.py", line 870, in next
    raise value
  File "D:\软件\python\lib\multiprocessing\pool.py", line 537, in _handle_tasks
    put(task)
  File "D:\软件\python\lib\multiprocessing\connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "D:\软件\python\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'parse_corpus.<locals>.tokenize'
How can I fix this?
Time: 2023-08-24 11:09:09 Views: 45
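This error occurs because `multiprocessing` must pickle every task it sends to a worker process, and the standard `pickle` module cannot serialize a function defined inside another function. Here `tokenize` is defined inside `parse_corpus`, and the traceback shows gensim's `get_texts()` failing while it hands tasks to its worker pool. The recommended fix is to move `tokenize` out of `parse_corpus` to module level, where it can be pickled by name; it can then be passed into `parse_corpus` as a parameter if needed. Alternatively, `pathos.multiprocessing` can replace Python's built-in `multiprocessing` module, because it can pickle local functions. That only helps where your own code creates the pool, though; in this traceback the pool is created inside gensim, so moving the function to module level is the more direct fix.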
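The difference between a module-level function and a nested one can be checked without gensim. The sketch below is only an illustration: `tokenize`, `make_local_tokenizer`, and `is_picklable` are hypothetical names, the whitespace tokenizer is a placeholder for real Chinese tokenization, and the four-parameter signature assumes the function is handed to `WikiCorpus` through its `tokenizer_func` argument.

```python
import pickle


def tokenize(text, token_min_len=1, token_max_len=100, lower=True):
    """Module-level tokenizer: pickle can reference it by its importable name."""
    if lower:
        text = text.lower()
    # Placeholder logic: split on whitespace and filter by token length.
    return [t for t in text.split() if token_min_len <= len(t) <= token_max_len]


def make_local_tokenizer():
    """Return a nested function, mirroring 'parse_corpus.<locals>.tokenize'."""
    def tokenize_local(text):
        return text.split()
    return tokenize_local


def is_picklable(obj):
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(is_picklable(tokenize))                # True when run as a script
    print(is_picklable(make_local_tokenizer()))  # False: local objects cannot be pickled
    # Assumed usage in the original script (not executed here):
    # wiki = WikiCorpus(infile, dictionary={}, tokenizer_func=tokenize)
```

On Windows, worker processes are started by spawning a new interpreter, so keeping the script's entry point under `if __name__ == "__main__":` is also needed when the script itself starts worker processes.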
Related questions
Java: convert 2023-06-06T09:47:10.87+08:00 to a timestamp
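To convert this time string to a timestamp, first parse it into a date-time object, then convert that object to epoch milliseconds. `SimpleDateFormat` with the pattern `SSS` would read the fraction `.87` as 87 milliseconds rather than 870, so the example below uses `java.time.OffsetDateTime`, which parses ISO-8601 fractions correctly:
```java
import java.time.OffsetDateTime;

public class Main {
    public static void main(String[] args) {
        String timeStr = "2023-06-06T09:47:10.87+08:00";
        // The default ISO-8601 parser reads ".87" as 0.87 s and honors the +08:00 offset,
        // so no separate time zone setting is needed.
        OffsetDateTime dateTime = OffsetDateTime.parse(timeStr);
        long timestamp = dateTime.toInstant().toEpochMilli();
        System.out.println(timestamp);
    }
}
```
Output:
```
1686016030870
```
The timestamp is in milliseconds: the number of milliseconds elapsed from 1970-01-01 00:00:00 UTC to the given instant.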
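The epoch value for this instant can be cross-checked independently with integer arithmetic. The following is a minimal Python sketch, not part of the original answer; `to_epoch_millis` is a hypothetical helper name, and the date-time is built by hand to avoid relying on any particular string parser.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_millis(dt):
    """Return milliseconds since 1970-01-01T00:00:00Z for a timezone-aware datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Floor-dividing two timedeltas yields an exact integer, avoiding float rounding.
    return (dt - epoch) // timedelta(milliseconds=1)


# 2023-06-06T09:47:10.87+08:00, with .87 s expressed as 870000 microseconds
moment = datetime(2023, 6, 6, 9, 47, 10, 870000,
                  tzinfo=timezone(timedelta(hours=8)))
print(to_epoch_millis(moment))  # 1686016030870
```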
Python: difference between 2023-05-13 18:00:00 and 2023-05-12 06:00:00
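Here are two ways to compute the difference between these two dates. Both print `delta.days`, which is only the whole-day part of the difference; the full difference is 1 day and 12 hours.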
Method 1: use the datetime module
```python
from datetime import datetime
date1 = datetime(2023, 5, 13, 18, 0, 0)
date2 = datetime(2023, 5, 12, 6, 0, 0)
delta = date1 - date2
print(delta.days)
```
Output:
```
1
```
Method 2: use the dateutil module (the third-party `python-dateutil` package)
```python
from dateutil import parser
date1 = parser.parse("2023-05-13 18:00:00")
date2 = parser.parse("2023-05-12 06:00:00")
delta = date1 - date2
print(delta.days)
```
The output is the same:
```
1
```
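Because `delta.days` drops the remaining 12 hours, it may help to look at the whole `timedelta`. The sketch below uses only the standard library; the format string `FMT` is an assumption matching the two timestamps in the question.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed format of the input strings
date1 = datetime.strptime("2023-05-13 18:00:00", FMT)
date2 = datetime.strptime("2023-05-12 06:00:00", FMT)
delta = date1 - date2

print(delta)                         # 1 day, 12:00:00
print(delta.days, delta.seconds)     # 1 43200
print(delta.total_seconds() / 3600)  # 36.0
```

`total_seconds()` is the safest starting point when the difference is needed in hours or minutes, since `days` and `seconds` each hold only part of the interval.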