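Use regular expressions to extract keywords from text. For example: "顺利打开JFK深刻理解", "www.baidu.com", "www.youku.com", "开始就立刻搭街坊率领的科技". Goal 1: extract the domain names with a regular expression. Goal 2: write the results to a table or a txt file.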
Time: 2023-11-21 10:17:02
You can use Python's re module to extract the domain names with a regular expression.
```python
import re
text = "顺利打开JFK深刻理解,www.baidu.com,www.youku.com,开始就立刻搭街坊率领的科技。"
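# Goal 1: extract the domain names with a regular expression
pattern = r"(www\.\w+\.\w+)"
domains = re.findall(pattern, text)
# Goal 2: write the results to a txt file, one domain per line
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Domain\n")
    for domain in domains:
        f.write(domain + "\n")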
```
output.txt then contains:
```
Domain
www.baidu.com
www.youku.com
```
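One caveat about the pattern above: in Python 3, `\w` also matches Chinese characters, so it only works here because commas separate the domains from the surrounding text. The sketch below uses a hypothetical sample string in which the domains touch the Chinese text directly, and compares the original pattern with a stricter one. The stricter character class (ASCII letters, digits and hyphens) is an assumption about what should count as a domain label, not a full domain-name validator.

```python
import re

# Hypothetical sample: the domains sit directly next to Chinese text,
# with no comma in between.
text = "顺利打开www.baidu.com深刻理解www.my-site.com.cn开始"

# Original pattern: \w is Unicode-aware, so it runs on into the Chinese text.
loose = re.findall(r"www\.\w+\.\w+", text)

# Stricter pattern (assumption): each label may contain only ASCII letters,
# digits and hyphens, and any number of dot-separated labels may follow "www".
strict_pattern = re.compile(r"www(?:\.[A-Za-z0-9-]+)+")
strict = strict_pattern.findall(text)

print(loose)   # the over-long match produced by \w
print(strict)  # only the domain names
```

With the stricter pattern, multi-part suffixes such as `.com.cn` and hyphenated labels are also captured in full.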
Related questions
Python: output as a table or txt
You can use Python's pandas library to write the result as a table.
```python
import re
import pandas as pd
text = "顺利打开JFK深刻理解,www.baidu.com,www.youku.com,开始就立刻搭街坊率领的科技。"
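# Goal 1: extract the domain names with a regular expression
pattern = r"(www\.\w+\.\w+)"
domains = re.findall(pattern, text)
# Goal 2: write the results as a table (CSV with a single "Domain" column)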
df = pd.DataFrame(domains, columns=["Domain"])
df.to_csv("output.csv", index=False)
```
output.csv then contains:
```
Domain
www.baidu.com
www.youku.com
```
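To write a txt file instead, call the same pandas to_csv function with a file name ending in ".txt". The extension does not change the format: the content is still comma-separated unless you pass another delimiter, for example sep="\t".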
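If pandas is not available, the standard-library csv module can produce the same kind of file. The following is a minimal sketch: the helper name `write_table` and the extra "No." column are assumptions made for illustration, and the table is written to an in-memory buffer so the result can be inspected directly.

```python
import csv
import io

# Domains as returned by re.findall in the example above.
domains = ["www.baidu.com", "www.youku.com"]

def write_table(rows, handle, delimiter="\t"):
    """Write a header row, then one numbered row per domain (hypothetical helper)."""
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["No.", "Domain"])
    for i, domain in enumerate(rows, start=1):
        writer.writerow([i, domain])

buffer = io.StringIO()
write_table(domains, buffer)
table_text = buffer.getvalue()
print(table_text)

# To write a real file instead of an in-memory buffer:
# with open("output.txt", "w", encoding="utf-8", newline="") as f:
#     write_table(domains, f)
```

A tab delimiter keeps the txt file readable in a plain editor and still opens as a table in spreadsheet software.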
Problem 4 (Medium): Use the nycflights13 package and the flights and planes tables to answer the following questions: a. What is the oldest plane (specified by the tailnum variable) that flew from New York City airports in 2013? b. How many airplanes that flew from New York City are included in the planes table?
a. To answer this question, we need to join the flights table with the planes table on the tailnum variable and filter for flights that departed in 2013 from New York City airports. Then, we can sort by the year variable in the planes table and select the oldest plane. Here's the code:
```{r}
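library(nycflights13)
library(dplyr)  # provides %>%, filter(), inner_join(), arrange(), ...

# Filter flights before joining: both tables have a `year` column, so after
# a join they would be renamed year.x (flight year) and year.y (build year).
oldest_plane <- flights %>%
  filter(origin %in% c("EWR", "JFK", "LGA") & year == 2013) %>%
  distinct(tailnum) %>%
  inner_join(planes, by = "tailnum") %>%
  select(tailnum, year) %>%
  arrange(year) %>%  # here `year` is the manufacturing year; NAs sort last
  slice_head(n = 1)
oldest_plane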
```
The output shows that the oldest plane that flew from New York City airports in 2013 is tail number "N381AA", manufactured in 1956.
```
# A tibble: 1 x 2
  tailnum  year
  <chr>   <int>
1 N381AA   1956
```
b. To answer this question, we need to count the number of unique tail numbers in the flights table that departed from New York City airports in 2013 and match them with the planes table. Here's the code:
```{r}
num_planes <- flights %>%
  filter(origin %in% c("EWR", "JFK", "LGA") & year == 2013) %>%
  distinct(tailnum) %>%
  inner_join(planes, by = "tailnum") %>%
  summarize(num_planes = n_distinct(tailnum))
num_planes
```
The output shows that 3,322 airplanes that flew from New York City airports in 2013 are included in the planes table.
```
# A tibble: 1 x 1
  num_planes
       <int>
1       3322
```
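The logic behind both answers (take the distinct tail numbers that flew, keep those with a match in planes, then count them or pick the earliest manufacturing year) can also be sketched in plain Python. The records below are hypothetical toy values, not the nycflights13 data, and the variable names are chosen for illustration only.

```python
# Toy stand-ins for the two tables (hypothetical values, not the real data).
flights = [
    {"tailnum": "N1", "origin": "JFK", "year": 2013},
    {"tailnum": "N1", "origin": "LGA", "year": 2013},
    {"tailnum": "N2", "origin": "EWR", "year": 2013},
    {"tailnum": "N3", "origin": "JFK", "year": 2013},
    {"tailnum": None, "origin": "JFK", "year": 2013},  # missing tail number
]
# tailnum -> manufacturing year; "N9" never appears in flights.
planes = {"N1": 1999, "N2": 1975, "N9": 1960}

nyc_airports = {"EWR", "JFK", "LGA"}

# Distinct tail numbers that departed from an NYC airport in 2013.
flown = {
    f["tailnum"]
    for f in flights
    if f["origin"] in nyc_airports and f["year"] == 2013 and f["tailnum"] is not None
}

# Inner join: keep only tail numbers that also appear in planes.
matched = {t for t in flown if t in planes}

num_planes = len(matched)                        # part b
oldest = min(matched, key=lambda t: planes[t])   # part a

print(num_planes, oldest)
```

In the toy data, "N3" flew but has no entry in planes, and "N9" is in planes but never flew, so neither is counted.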