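Given an English text, implement the following and print the results (comma-separated): extracted email addresses; extracted URLs; words of length 3; words ending in 'ing'; consecutive URLs; URL parameter names and values; verbs (words ending in -ed or -ing). Write complete Python code that prints the results.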
Time: 2024-09-25 15:14:26
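To implement these features, we use the regular expression module `re` to match email addresses, URLs, words of a given length, URL parameters, words with a given suffix, and consecutive URLs. Here is a sample: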
```python
import re
# Assume we have the following English text
text = """
Hello, my email is john@example.com and website www.example.com/path?param1=value1&param2=value2.
Here are some words: running, jumping, singing, coding, finished.
Check out these URLs:
https://example.com, https://other.example/path?query=param.
"""
# Extract email addresses
email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
emails = re.findall(email_pattern, text)
# Extract URLs (http(s):// or bare www.), then trim trailing sentence punctuation
url_pattern = r'(?:https?://|www\.)[^\s,]+'
urls = [u.rstrip('.,;:!?') for u in re.findall(url_pattern, text)]
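# Extract three-letter words (\b also splits emails and URLs, so fragments such as 'www' and 'com' are included)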
three_letter_words = re.findall(r'\b[a-zA-Z]{3}\b', text)
# Extract words ending in 'ing'
ending_ing_words = re.findall(r'\b\w+ing\b', text)
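# Find consecutive URLs (here: every http(s) URL in order of appearance, trailing punctuation trimmed)
consecutive_urls = [u.rstrip('.,;:!?') for u in re.findall(r'https?://[^\s,]+', text)]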
# Parse URL parameters: name=value pairs that follow '?' or '&'
url_params = dict(re.findall(r'[?&](\w+)=(\w+)', text))
# Extract verbs (words ending in -ed or -ing); a suffix heuristic, so it can also catch non-verbs such as "red" or "king"
verbs = set(re.findall(r'\b\w+(?:ed|ing)\b', text))
output = f"Emails: {', '.join(emails)}, URLs: {', '.join(urls)}, " \
f"Three-letter words: {', '.join(three_letter_words)}, " \
f"Ing-ending words: {', '.join(ending_ing_words)}, " \
f"Consecutive URLs: {', '.join(consecutive_urls)}, " \
f"Parameters: {url_params}, Verbs: {', '.join(sorted(verbs))}"
print(output)
```
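The task does not define "consecutive URLs" precisely, and a regex over the whole text cannot tell which URL a parameter belongs to. The sketch below shows one possible reading, not the only one: it treats URLs separated only by commas or whitespace as one consecutive run, and it reads parameters per URL with the standard library's `urllib.parse`. The names `URL_RE`, `TRAILING`, `url_query_params`, `consecutive_url_runs`, and the `sample` text are hypothetical and introduced only for illustration.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Assumption: a URL starts with http(s):// or www. and runs to the next
# whitespace or comma; trailing sentence punctuation is trimmed afterwards.
URL_RE = r'(?:https?://|www\.)[^\s,]+'
TRAILING = '.,;:!?'


def url_query_params(url):
    """Return the query parameters of one URL as (name, value) pairs."""
    return parse_qsl(urlsplit(url).query)


def consecutive_url_runs(text):
    """Group URLs that are separated only by commas and/or whitespace.

    Returns only the runs that contain at least two URLs.
    """
    runs, current, last_end = [], [], None
    for m in re.finditer(URL_RE, text):
        url = m.group().rstrip(TRAILING)
        gap = '' if last_end is None else text[last_end:m.start()]
        if current and gap.strip(', \t\r\n'):
            # Something other than a comma or whitespace sits between the URLs
            runs.append(current)
            current = []
        current.append(url)
        # Measure the next gap from the end of the trimmed URL, so that a
        # sentence-ending period counts as a separator between runs.
        last_end = m.start() + len(url)
    if current:
        runs.append(current)
    return [run for run in runs if len(run) > 1]


sample = ("See www.example.com/path?a=1&b=2. "
          "Then https://example.com, https://other.example/x?q=z. Done.")

print(url_query_params("www.example.com/path?a=1&b=2"))
print(consecutive_url_runs(sample))
```

Under this reading, a URL that ends one sentence and a URL that starts the next do not count as consecutive, because the period between them is neither a comma nor whitespace. If the task means something else by "consecutive", only the gap test needs to change.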