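This is what I have so far, but I've run into a problem.
The page https://xxxxxxxx.zendesk.com/tickets/33126 contains a link to a .jpg image, and I want to download that image. A page may contain several images, so I need to scan it for all .jpg, .gif, etc. files.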
I'm having an issue with my code at the end. I'll explain there.
public static void GetTicketAttachments(string url)
{
    GetImages("https://xxxxxxxx.zendesk.com/tickets/33126");
}

static void GetImages(string url)
{
    string responseString;
    HttpWebRequest initialRequest = (HttpWebRequest)WebRequest.Create(url);
    using (HttpWebResponse initialResponse = (HttpWebResponse)initialRequest.GetResponse())
    {
        using (StreamReader reader = new StreamReader(initialResponse.GetResponseStream()))
        {
            responseString = reader.ReadToEnd();
        }
        List<string> imageset = new List<string>();
        Regex regex = new Regex(@"f=""[^""]*jpg|bmp|tif|gif|png", RegexOptions.IgnoreCase);
        foreach (Match m in regex.Matches(responseString))
        {
            if (!imageset.Contains(m.Value))
                imageset.Add(m.Value);
        }
        for (int i = 0; i < imageset.Count; i++)
            imageset[i] = imageset[i].Remove(0, 3);
        totalFiles = imageset.Count;
        currentFiles = totalFiles;
        foreach (string f in imageset)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(DownloadImage), f);
        }
    }
}
The problem occurs here. For some reason, the object "path" is always empty, so I can't download the image.
static void DownloadImage(object path)
{
    currentFiles--;
    path = Path.GetFileName(path.ToString());
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(path.ToString());
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        Image image = Image.FromStream(response.GetResponseStream());
        image.Save(@"C:\" + Path.GetFileName(path.ToString()));
    }
}
Does anyone know what the problem is? The "image count" is actually 1 (a single link to an image on the page).
Posted on 2012-11-08 00:16:17
Don't try to parse the document yourself. Take a look at the HTML Agility Pack ( http://htmlagilitypack.codeplex.com/ ) to extract meaningful information from HTML documents.
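As a rough sketch of what that could look like: in the question's pattern the `|` alternation applies to the whole expression, so a bare `bmp`, `tif`, `gif` or `png` can match on its own, and `Remove(0, 3)` then leaves an empty string, which may be one source of the empty path. Letting a parser find the links avoids that. The example below assumes the HtmlAgilityPack NuGet package is referenced; the class name `ImageLinkExtractor` and the method `Extract` are hypothetical names chosen for illustration, not part of the library or of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack package is installed

static class ImageLinkExtractor // hypothetical helper, not from the question
{
    // Extensions listed in the question's regex; adjust as needed.
    static readonly string[] ImageExtensions = { ".jpg", ".bmp", ".tif", ".gif", ".png" };

    // Returns absolute http(s) URLs of <a href> and <img src> targets
    // whose path ends in one of the image extensions above.
    public static List<Uri> Extract(string html, Uri pageUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<Uri>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//a[@href] | //img[@src]");
        if (nodes == null)
            return results;

        foreach (HtmlNode node in nodes)
        {
            string attribute = node.Name == "img" ? "src" : "href";
            string raw = node.GetAttributeValue(attribute, "");

            // Resolve relative links against the address of the page.
            Uri absolute;
            if (!Uri.TryCreate(pageUri, raw, out absolute))
                continue;

            // Skip mailto:, javascript: and other non-web schemes.
            if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
                continue;

            string extension = Path.GetExtension(absolute.AbsolutePath);
            bool isImage = ImageExtensions.Contains(extension, StringComparer.OrdinalIgnoreCase);
            if (isImage && !results.Contains(absolute))
                results.Add(absolute);
        }
        return results;
    }
}
```

Each returned `Uri` can be handed to the download routine unchanged. Note that `DownloadImage` in the question overwrites `path` with `Path.GetFileName(...)` before calling `WebRequest.Create`, so the request is made with a bare file name instead of the full URL; keeping the full URL for the request and using the file name only for the local save path would avoid that.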
https://stackoverflow.com/questions/13273697