Run async every x iterations in a for loop

Problem description:

I am downloading 100K+ files and want to do it in batches, for example 100 files at a time.

static void Main(string[] args) { 
    Task.WaitAll(
     new Task[]{ 
      RunAsync() 
    }); 
} 

// each group has 100 attachments. 
static async Task RunAsync() { 
    foreach (var group in groups) { 
     var tasks = new List<Task>(); 
     foreach (var attachment in group.attachments) { 
      tasks.Add(DownloadFileAsync(attachment, downloadPath)); 
     } 
     await Task.WhenAll(tasks); 
    } 
} 

static async Task DownloadFileAsync(Attachment attachment, string path) { 
    using (var client = new HttpClient()) { 
     using (var fileStream = File.Create(path + attachment.FileName)) { 
      var downloadedFileStream = await client.GetStreamAsync(attachment.url); 
      await downloadedFileStream.CopyToAsync(fileStream); 
     } 
    } 
} 

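Expected: it should download 100 files at a time, then move on to the next 100.

Actual: it downloads many more than that at the same time, and quickly fails with the error: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host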

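It's a shame this got marked as a duplicate, because the other question uses a significantly different approach, and I would love to learn why the one Quentin used fails. – Bartosz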

I agree; not a duplicate. My guess is that the HttpClient methods return earlier than you expect. – BradleyDotNET

I strongly recommend reading [Is async HttpClient from .NET 4.5 a bad choice for intensive load applications?](https://*.com/questions/16194054/is-async-httpclient-from-net-4-5-a-bad-choice-for-intensive-load-applications). –

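Running tasks in "batches" is not a good idea in terms of performance. A single long-running task blocks the whole batch, because the next batch cannot start until every task in the current one has finished. A better approach is to start a new task as soon as one completes.

This can be implemented with a queue, as @MertAkcakaya suggested. But I will post another alternative, based on my other answer, "Have a set of Tasks with only X running at a time":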

int maxTread = 3; 
System.Net.ServicePointManager.DefaultConnectionLimit = 50; //Set this once to a max value in your app 

var urls = new Tuple<string, string>[] { 
    Tuple.Create("http://cnn.com","temp/cnn1.htm"), 
    Tuple.Create("http://cnn.com","temp/cnn2.htm"), 
    Tuple.Create("http://bbc.com","temp/bbc1.htm"), 
    Tuple.Create("http://bbc.com","temp/bbc2.htm"), 
    Tuple.Create("http://*.com","temp/*.htm"), 
    Tuple.Create("http://google.com","temp/google1.htm"), 
    Tuple.Create("http://google.com","temp/google2.htm"), 
}; 
DownloadParallel(urls, maxTread); 

async Task DownloadParallel(IEnumerable<Tuple<string,string>> urls, int maxThreads) 
{ 
    SemaphoreSlim maxThread = new SemaphoreSlim(maxThreads); 
    var client = new HttpClient(); 

    foreach(var url in urls) 
    { 
     await maxThread.WaitAsync(); 
     DownloadFile(client, url.Item1, url.Item2) 
        .ContinueWith((task) => maxThread.Release()); 
    } 
} 


async Task DownloadFile(HttpClient client, string url, string fileName) 
{ 
    var stream = await client.GetStreamAsync(url); 
    using (var fileStream = File.Create(fileName)) 
    { 
     await stream.CopyToAsync(fileStream); 
    } 
} 

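PS: DownloadParallel returns as soon as it has started the last download, so don't await it. If you really want to wait for all downloads to finish, add this at the end of the method: for (int i = 0; i < maxThreads; i++) await maxThread.WaitAsync();

PS2: Don't forget to add exception handling to DownloadFile.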
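
The two notes above can be combined in one place. What follows is only a minimal sketch under stated assumptions, not the answer's own code: the names ForEachThrottledAsync and RunOneAsync are hypothetical, and Task.Delay stands in for the real download so the example runs without network access. It keeps the SemaphoreSlim throttle, but collects the started tasks so the caller can wait for all of them, and it records failures instead of losing them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ThrottleSketch
{
    // Hypothetical helper: runs `work` for every item with at most
    // `maxParallel` running at once, and completes only after all finish.
    // Returns the exceptions thrown by individual items.
    static async Task<List<Exception>> ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxParallel, Func<T, Task> work)
    {
        var errors = new List<Exception>();
        var running = new List<Task>();
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            foreach (var item in items)
            {
                await gate.WaitAsync();               // wait for a free slot
                running.Add(RunOneAsync(item, work, gate, errors));
            }
            await Task.WhenAll(running);              // RunOneAsync never throws
        }
        return errors;
    }

    // Runs one item, records any failure, and always frees the slot.
    static async Task RunOneAsync<T>(
        T item, Func<T, Task> work, SemaphoreSlim gate, List<Exception> errors)
    {
        try { await work(item); }
        catch (Exception ex) { lock (errors) errors.Add(ex); }
        finally { gate.Release(); }
    }

    static async Task Main()
    {
        int current = 0, peak = 0;
        var sync = new object();

        var errors = await ForEachThrottledAsync(Enumerable.Range(1, 20), 3, async n =>
        {
            lock (sync) { current++; peak = Math.Max(peak, current); }
            try
            {
                await Task.Delay(50);                 // stand-in for a download
                if (n == 7) throw new InvalidOperationException("simulated failure");
            }
            finally { lock (sync) current--; }
        });

        Console.WriteLine($"peak concurrency: {peak}, failures: {errors.Count}");
    }
}
```

To use it for real downloads, the lambda would call something like DownloadFile(client, url.Item1, url.Item2) with one shared HttpClient. The release sits in a finally block so that a failed download still frees its slot; otherwise the loop could stall once enough downloads had failed.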