
Commented Unassigned: Issue when converting a PDF to PNG [1394]

I have noticed issues when converting large PDFs (3.5MB and higher): the conversion freezes the system and, over the course of 5 minutes, creates temp files totaling over 10GB.

Let me know if you need any info
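One knob that may at least keep the temp files from filling the disk is ImageMagick's resource limits, which Magick.NET exposes through a static `ResourceLimits` class. A minimal sketch, assuming that class is available in the version in use (the 2GB/8GB caps below are arbitrary example values, not recommendations):

```
using ImageMagick;

public static class LimitDemo
{
    public static void Main()
    {
        // Cap the memory the pixel cache may use before spilling to disk (in bytes).
        ResourceLimits.Memory = 2UL * 1024 * 1024 * 1024;   // 2GB, arbitrary example value

        // Cap the disk space the pixel cache may consume for temp files (in bytes).
        ResourceLimits.Disk = 8UL * 1024 * 1024 * 1024;     // 8GB, arbitrary example value

        // ...run the PDF-to-PNG conversion after the limits are set...
    }
}
```

With these set, exceeding the memory limit should make ImageMagick spill to its disk cache, and exceeding the disk limit should make the read fail with a cache error rather than consuming the whole drive.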
Comments: I've run into essentially the same issue, but overall I don't think this is a bug, just more of a performance issue. When running what is basically the example of converting a PDF into multiple PNG images with a reasonably large file (26MB), I see ~25% CPU usage and upwards of 12GB of memory during the peak of reading the file. For this sample I printed to PDF the HTML ebook of ["Les Miserables"](https://www.gutenberg.org/files/135/135-h/135-h.htm) from Project Gutenberg. This is the one-file program I used, targeting .NETCoreApp (v1.0) with Magick.NET.Core-Q8 v7.0.2.100.

```
using ImageMagick;
using System;
using System.IO;

namespace DocumentConverter
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var name = @"Les Misérables, Five Volumes, Complete by Victor Hugo";
            var cwd = Directory.GetCurrentDirectory();
            MagickNET.SetTempDirectory(Path.Combine(cwd, "temp"));

            // Read the default settings and override the density to 300 dpi for better quality
            MagickReadSettings settings = new MagickReadSettings();
            settings.Density = new Density(300, 300);

            using (MagickImageCollection images = new MagickImageCollection())
            {
                // Add all the pages of the pdf to the collection
                images.Read(Path.Combine(cwd, $"{name}.pdf"), settings);

                int page = 1;
                foreach (MagickImage image in images)
                {
                    // Write page to file that contains page number
                    image.Write(Path.Combine(cwd, "out", $"{name} (page {page}).png"));
                    page++;
                }
            }
        }
    }
}
```

During execution, the program "hung" at `images.Read()`, which synchronously read the file into memory and seemed to write a temp file for each page of the PDF (899 pages) in the assigned temp directory, then rolled those files up into 32MB chunks. This process took approximately 45 minutes, peaked at about 12GB of memory, and consumed roughly 14GB of hard-drive space for the temp files. Memory consumption started to drop once the individual page temp files were being rolled up into the chunks.

Once it reached the foreach loop, it retained the entire contents of the temp directory for the duration and sequentially wrote out each PNG file, taking about a second per image. Memory usage crept up again during this phase, but didn't go past 12GB, fluctuating between 10GB and 11GB for most of the process.

All of these stats are based on Task Manager readings and wall-clock timings, so they are not precise, but they do paint a reasonable picture that splitting a PDF into individual PNGs is not a very quick or efficient process.
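A possible mitigation, sketched below assuming `MagickReadSettings` exposes the `FrameIndex` and `FrameCount` properties in this version of Magick.NET, is to read the PDF a few pages at a time instead of pulling all 899 pages into one collection, so peak memory and temp-file usage scale with the batch size rather than with the whole document. The helper name, output naming, and batch size are made up for illustration:

```
using ImageMagick;
using System;
using System.IO;

public static class PagedPdfConverter
{
    // Hypothetical helper: converts a PDF to PNGs in small batches to keep peak memory bounded.
    public static void Convert(string pdfPath, string outDir, int totalPages, int batchSize = 10)
    {
        for (int start = 0; start < totalPages; start += batchSize)
        {
            MagickReadSettings settings = new MagickReadSettings();
            settings.Density = new Density(300, 300);
            settings.FrameIndex = start;                                    // first page of this batch (0-based)
            settings.FrameCount = Math.Min(batchSize, totalPages - start);  // number of pages in this batch

            using (MagickImageCollection pages = new MagickImageCollection())
            {
                // Only the pages in this batch are decoded and held in memory at once
                pages.Read(pdfPath, settings);

                int page = start + 1;
                foreach (MagickImage image in pages)
                {
                    image.Write(Path.Combine(outDir, $"page-{page}.png"));
                    page++;
                }
            }
        }
    }
}
```

Ghostscript still has to re-open the PDF for each batch, so the wall-clock time may not improve much, but the pixel cache only has to hold a batch's worth of pages at a time instead of all 899.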

