Scraping company data from http://www.cnena.com/brand/list-htm-fid-9.html with C#
Date: 2023-12-21 16:07:02 Views: 43
Sure. Here is sample C# code that scrapes company data from http://www.cnena.com/brand/list-htm-fid-9.html:
```csharp
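using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

namespace Crawler
{
    class Program
    {
        static async Task Main(string[] args)
        {
            string url = "http://www.cnena.com/brand/list-htm-fid-9.html";

            // Download the page and fail early on a non-success HTTP status.
            string content;
            using (HttpClient httpClient = new HttpClient())
            {
                HttpResponseMessage response = await httpClient.GetAsync(url);
                response.EnsureSuccessStatusCode();
                content = await response.Content.ReadAsStringAsync();
            }

            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(content);

            // The XPath below assumes each company sits in <div class="company">;
            // adjust it to the page's actual markup.
            // SelectNodes returns null (not an empty collection) when nothing matches.
            HtmlNodeCollection items = doc.DocumentNode.SelectNodes("//div[@class='company']");
            if (items == null)
            {
                Console.WriteLine("No company nodes found; check the XPath against the page source.");
                return;
            }

            List<Dictionary<string, string>> companies = new List<Dictionary<string, string>>();
            foreach (HtmlNode item in items)
            {
                // SelectSingleNode also returns null on no match, so fall back to "".
                string name = item.SelectSingleNode("./h3")?.InnerText.Trim() ?? "";
                string link = item.SelectSingleNode("./a")?.GetAttributeValue("href", "") ?? "";
                // "地址" is the address label and "电话" the phone label in the page text.
                string address = Regex.Match(item.InnerText, @"地址:(.*)").Groups[1].Value.Trim();
                string tel = Regex.Match(item.InnerText, @"电话:(.*)").Groups[1].Value.Trim();
                companies.Add(new Dictionary<string, string>
                {
                    { "name", name }, { "link", link }, { "address", address }, { "tel", tel }
                });
            }

            // Print each company as a block of four lines followed by a blank line.
            foreach (var company in companies)
            {
                Console.WriteLine(company["name"]);
                Console.WriteLine(company["link"]);
                Console.WriteLine(company["address"]);
                Console.WriteLine(company["tel"]);
                Console.WriteLine();
            }
        }
    }
}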
```
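This code fetches the page with the HttpClient class, parses it with the HtmlAgilityPack library, and then uses regular expressions to extract the address and phone fields. You can modify and optimize the code as needed.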
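The regular-expression step can be tried on its own, without any network access. The following is a minimal sketch, not part of the original answer: the `FieldExtractor` class, its `Extract` method, and the sample text are all hypothetical, and the sample only stands in for what `item.InnerText` might contain. It accepts either a full-width or an ASCII colon after the label and returns an empty string when the label is missing.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original answer) for pulling "label: value" fields out of text.
static class FieldExtractor
{
    // Returns the text after "<label>:" (full-width or ASCII colon) up to the end of that line,
    // or an empty string when the label does not occur.
    public static string Extract(string text, string label)
    {
        string pattern = Regex.Escape(label) + @"[ \t]*[::][ \t]*([^\r\n]*)";
        Match m = Regex.Match(text, pattern);
        return m.Success ? m.Groups[1].Value.Trim() : "";
    }
}

class Demo
{
    static void Main()
    {
        // Made-up sample text standing in for item.InnerText; not real data from the site.
        string sample = "Example Co.\n地址:1 Example Road\n电话: 000-0000000\n";

        Console.WriteLine(FieldExtractor.Extract(sample, "地址"));        // prints "1 Example Road"
        Console.WriteLine(FieldExtractor.Extract(sample, "电话"));        // prints "000-0000000"
        Console.WriteLine(FieldExtractor.Extract(sample, "邮箱").Length); // prints 0 (label absent)
    }
}
```

Matching `[^\r\n]*` rather than `.*` keeps a trailing carriage return out of the captured value, and using `[ \t]*` around the colon stops an empty field from swallowing the next line.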