CV. PDF functions

Introduction

The PDF functions in PHP can create PDF files using the PDFlib library created by Thomas Merz.

The documentation in this section is only meant to be an overview of the available functions in the PDFlib library and should not be considered an exhaustive reference. Please consult the documentation included in the source distribution of PDFlib for the full and detailed explanation of each function here. It provides a very good overview of what PDFlib is capable of doing and contains the most up-to-date documentation of all functions.

All of the functions in PDFlib and the PHP module have identical function names and parameters. You will need to understand some of the basic concepts of PDF and PostScript to efficiently use this extension. All lengths and coordinates are measured in PostScript points. There are generally 72 PostScript points to an inch, but this depends on the output resolution. Please see the PDFlib documentation included with the source distribution of PDFlib for a more thorough explanation of the coordinate system used.
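
Because every length passed to the PDF functions is in points, it can help to convert from physical units first. The following is a minimal sketch, assuming PHP 7 or later; the helper names `inches_to_points()` and `mm_to_points()` are hypothetical and not part of PDFlib or this extension. It shows where the A4 page size of 595 x 842 points comes from.

```php
<?php
// Hypothetical helpers (not part of PDFlib): convert physical lengths to points.
const POINTS_PER_INCH = 72.0;
const MM_PER_INCH = 25.4;

function inches_to_points(float $inches): float
{
    return $inches * POINTS_PER_INCH;
}

function mm_to_points(float $mm): float
{
    return $mm / MM_PER_INCH * POINTS_PER_INCH;
}

// A4 paper is 210 mm x 297 mm; round to whole points for use as a page size.
$a4Width  = (int) round(mm_to_points(210));
$a4Height = (int) round(mm_to_points(297));
```

The rounded values could then be passed as the width and height of a new page.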

Please note that most of the PDF functions require a pdfdoc as their first parameter. Please see the examples below for more information.

Note: If you're interested in alternative free PDF generators that do not use external PDF libraries, see this related FAQ.

Note: This extension has been moved to PECL as of PHP 4.3.9.

Requirements

PDFlib is available for download at http://www.pdflib.com/products/pdflib/index.html, but requires that you purchase a license for commercial use. The JPEG and TIFF libraries are required to compile this extension.

Issues with older versions of PDFlib

Any version of PHP 4 after March 9, 2000 does not support versions of PDFlib older than 3.0.

PDFlib 3.0 or greater is supported by PHP 3.0.19 and later.

Installation

This PECL extension is not bundled with PHP. Further information such as new releases, downloads, source files, maintainer information, and a changelog can be found here: http://pecl.php.net/package/pdflib.

To get these functions to work in PHP < 4.3.9, you have to compile PHP with --with-pdflib[=DIR]. DIR is the PDFlib base install directory and defaults to /usr/local. In addition, you can specify the JPEG, TIFF, and PNG libraries for PDFlib to use; this is optional for PDFlib 4.x. To do so, add the options --with-jpeg-dir[=DIR] --with-png-dir[=DIR] --with-tiff-dir[=DIR] to your configure line.

When using version 3.x of PDFlib, you should configure PDFlib with the option --enable-shared-pdflib.

As of PHP 4.3.9, you must install this extension through PEAR, using the following command: pear install pdflib.

Runtime Configuration

This extension has no configuration directives defined in php.ini.

Confusion with old PDFlib versions

Starting with PHP 4.0.5, the PHP extension for PDFlib is officially supported by PDFlib GmbH. This means that all the functions described in the PDFlib manual (V3.00 or greater) are supported by PHP 4 with exactly the same meaning and the same parameters. Only the return values may differ from the PDFlib manual, because the PHP convention of returning FALSE was adopted. For compatibility reasons, this binding for PDFlib still supports the old functions, but they should be replaced by their new versions. PDFlib GmbH will not support any problems arising from the use of these deprecated functions.

Table 1. Deprecated functions and their replacements

Old function	Replacement
pdf_put_image()	Not needed anymore.
pdf_execute_image()	Not needed anymore.
pdf_get_annotation()	pdf_get_bookmark() using the same parameters.
pdf_get_font()	pdf_get_value() passing `"font"` as the second parameter.
pdf_get_fontsize()	pdf_get_value() passing `"fontsize"` as the second parameter.
pdf_get_fontname()	pdf_get_parameter() passing `"fontname"` as the second parameter.
pdf_set_info_creator()	pdf_set_info() passing `"Creator"` as the second parameter.
pdf_set_info_title()	pdf_set_info() passing `"Title"` as the second parameter.
pdf_set_info_subject()	pdf_set_info() passing `"Subject"` as the second parameter.
pdf_set_info_author()	pdf_set_info() passing `"Author"` as the second parameter.
pdf_set_info_keywords()	pdf_set_info() passing `"Keywords"` as the second parameter.
pdf_set_leading()	pdf_set_value() passing `"leading"` as the second parameter.
pdf_set_text_rendering()	pdf_set_value() passing `"textrendering"` as the second parameter.
pdf_set_text_rise()	pdf_set_value() passing `"textrise"` as the second parameter.
pdf_set_horiz_scaling()	pdf_set_value() passing `"horizscaling"` as the second parameter.
pdf_set_text_matrix()	Not available anymore.
pdf_set_char_spacing()	pdf_set_value() passing `"charspacing"` as the second parameter.
pdf_set_word_spacing()	pdf_set_value() passing `"wordspacing"` as the second parameter.
pdf_set_transition()	pdf_set_parameter() passing `"transition"` as the second parameter.
pdf_open()	pdf_new() plus a subsequent call of pdf_open_file()
pdf_set_font()	pdf_findfont() plus a subsequent call of pdf_setfont()
pdf_set_duration()	pdf_set_value() passing `"duration"` as the second parameter.
pdf_open_gif()	pdf_open_image_file() passing `"gif"` as the second parameter.
pdf_open_jpeg()	pdf_open_image_file() passing `"jpeg"` as the second parameter.
pdf_open_tiff()	pdf_open_image_file() passing `"tiff"` as the second parameter.
pdf_open_png()	pdf_open_image_file() passing `"png"` as the second parameter.
pdf_get_image_width()	pdf_get_value() passing `"imagewidth"` as the second parameter and the image as the third parameter.
pdf_get_image_height()	pdf_get_value() passing `"imageheight"` as the second parameter and the image as the third parameter.
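
When migrating old scripts, the rows of Table 1 that map to pdf_set_value() can be kept in a small lookup. This is only a sketch, assuming PHP 7 or later: the constant `DEPRECATED_VALUE_SETTERS` and the function `value_key_for()` are hypothetical names, and the keys are copied from the table above. For instance, the table implies that a call such as pdf_set_leading($pdf, 14) becomes pdf_set_value($pdf, "leading", 14).

```php
<?php
// Hypothetical lookup built from Table 1:
// deprecated setter name => key to pass as the second parameter of pdf_set_value().
const DEPRECATED_VALUE_SETTERS = [
    'pdf_set_leading'        => 'leading',
    'pdf_set_text_rendering' => 'textrendering',
    'pdf_set_text_rise'      => 'textrise',
    'pdf_set_horiz_scaling'  => 'horizscaling',
    'pdf_set_char_spacing'   => 'charspacing',
    'pdf_set_word_spacing'   => 'wordspacing',
    'pdf_set_duration'       => 'duration',
];

// Returns the pdf_set_value() key for a deprecated setter, or null if the
// function is not one of the setters listed above. PHP function names are
// case-insensitive, so the lookup is too.
function value_key_for(string $oldFunction): ?string
{
    return DEPRECATED_VALUE_SETTERS[strtolower($oldFunction)] ?? null;
}
```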

Examples

Most of the functions are fairly easy to use. The most difficult part is probably creating your first PDF document. The following example should help to get you started. It creates test.pdf with one page. The page contains the text "Times Roman outlined" in an outlined, 30pt font. The text is also underlined.

Example 1. Creating a PDF document with PDFlib
<?php
$pdf = pdf_new();
pdf_open_file($pdf, "test.pdf");

pdf_set_info($pdf, "Author", "Uwe Steinmann");
pdf_set_info($pdf, "Title", "Test for PHP wrapper of PDFlib 2.0");
pdf_set_info($pdf, "Creator", "See Author");
pdf_set_info($pdf, "Subject", "Testing");

pdf_begin_page($pdf, 595, 842); // A4 page size in points
pdf_add_outline($pdf, "Page 1");

$font = pdf_findfont($pdf, "Times New Roman", "winansi", 1);
pdf_setfont($pdf, $font, 30); // 30pt, matching the description above
pdf_set_value($pdf, "textrendering", 1); // outlined text
pdf_show_xy($pdf, "Times Roman outlined", 50, 750);

// underline the text
pdf_moveto($pdf, 50, 740);
pdf_lineto($pdf, 330, 740);
pdf_stroke($pdf);

pdf_end_page($pdf);
pdf_close($pdf);
pdf_delete($pdf);

echo "<A HREF=getpdf.php>finished</A>";
?>
The script getpdf.php just returns the PDF document.
Example 2. Outputting a precalculated PDF
<?php
$filename = "test.pdf"; // the file written by Example 1

$len = filesize($filename);
header("Content-type: application/pdf");
header("Content-Length: $len");
header("Content-Disposition: inline; filename=foo.pdf");
readfile($filename);
?>
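
The three header lines used when outputting a PDF can be collected in one place. The following is a sketch only, assuming PHP 7 or later; `pdf_response_headers()` is a hypothetical helper, not part of PHP or PDFlib. It returns the header lines as strings so they can be inspected before being sent, and it lets the caller choose between inline display and a download dialog.

```php
<?php
// Hypothetical helper: build the response header lines for a PDF of known size.
function pdf_response_headers(int $length, string $filename, bool $inline = true): array
{
    $disposition = $inline ? 'inline' : 'attachment';

    // Remove characters that would break the quoted filename
    // (double quotes, backslashes, and line breaks).
    $safeName = preg_replace('/["\\\\\r\n]/', '', $filename);

    return [
        'Content-Type: application/pdf',
        'Content-Length: ' . $length,
        'Content-Disposition: ' . $disposition . '; filename="' . $safeName . '"',
    ];
}

// Typical use (commented out so the sketch has no side effects):
// foreach (pdf_response_headers(filesize($filename), 'foo.pdf') as $line) {
//     header($line);
// }
// readfile($filename);
```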

The PDFlib distribution contains a more complex example which creates a page with an analog clock. Here we use the in-memory creation feature of PDFlib to alleviate the need to use temporary files. The example was converted to PHP from the PDFlib example. (The same example is available in the CLibPDF documentation.)

Example 3. pdfclock example from PDFlib distribution
<?php
$radius = 200;
$margin = 20;
$pagecount = 10;

$pdf = pdf_new();

// an empty filename creates the document in memory
if (!pdf_open_file($pdf, "")) {
    echo "error";
    exit;
}

pdf_set_parameter($pdf, "warning", "true");

pdf_set_info($pdf, "Creator", "pdf_clock.php");
pdf_set_info($pdf, "Author", "Uwe Steinmann");
pdf_set_info($pdf, "Title", "Analog Clock");

while ($pagecount-- > 0) {
    pdf_begin_page($pdf, 2 * ($radius + $margin), 2 * ($radius + $margin));

    pdf_set_parameter($pdf, "transition", "wipe");
    pdf_set_value($pdf, "duration", 0.5);

    pdf_translate($pdf, $radius + $margin, $radius + $margin);
    pdf_save($pdf);
    pdf_setrgbcolor($pdf, 0.0, 0.0, 1.0);

    /* minute strokes */
    pdf_setlinewidth($pdf, 2.0);
    for ($alpha = 0; $alpha < 360; $alpha += 6) {
        pdf_rotate($pdf, 6.0);
        pdf_moveto($pdf, $radius, 0.0);
        pdf_lineto($pdf, $radius-$margin/3, 0.0);
        pdf_stroke($pdf);
    }

    pdf_restore($pdf);
    pdf_save($pdf);

    /* 5 minute strokes */
    pdf_setlinewidth($pdf, 3.0);
    for ($alpha = 0; $alpha < 360; $alpha += 30) {
        pdf_rotate($pdf, 30.0);
        pdf_moveto($pdf, $radius, 0.0);
        pdf_lineto($pdf, $radius-$margin, 0.0);
        pdf_stroke($pdf);
    }

    $ltime = getdate();

    /* draw hour hand */
    pdf_save($pdf);
    pdf_rotate($pdf,-(($ltime['minutes']/60.0)+$ltime['hours']-3.0)*30.0);
    pdf_moveto($pdf, -$radius/10, -$radius/20);
    pdf_lineto($pdf, $radius/2, 0.0);
    pdf_lineto($pdf, -$radius/10, $radius/20);
    pdf_closepath($pdf);
    pdf_fill($pdf);
    pdf_restore($pdf);

    /* draw minute hand */
    pdf_save($pdf);
    pdf_rotate($pdf,-(($ltime['seconds']/60.0)+$ltime['minutes']-15.0)*6.0);
    pdf_moveto($pdf, -$radius/10, -$radius/20);
    pdf_lineto($pdf, $radius * 0.8, 0.0);
    pdf_lineto($pdf, -$radius/10, $radius/20);
    pdf_closepath($pdf);
    pdf_fill($pdf);
    pdf_restore($pdf);

    /* draw second hand */
    pdf_setrgbcolor($pdf, 1.0, 0.0, 0.0);
    pdf_setlinewidth($pdf, 2);
    pdf_save($pdf);
    pdf_rotate($pdf, -(($ltime['seconds'] - 15.0) * 6.0));
    pdf_moveto($pdf, -$radius/5, 0.0);
    pdf_lineto($pdf, $radius, 0.0);
    pdf_stroke($pdf);
    pdf_restore($pdf);

    /* draw little circle at center */
    pdf_circle($pdf, 0, 0, $radius/30);
    pdf_fill($pdf);

    pdf_restore($pdf);

    pdf_end_page($pdf);

    # to see some difference
    sleep(1);
}

pdf_close($pdf);

$buf = pdf_get_buffer($pdf);
$len = strlen($buf);

header("Content-type: application/pdf");
header("Content-Length: $len");
header("Content-Disposition: inline; filename=foo.pdf");
echo $buf;

pdf_delete($pdf);
?>
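
The rotation arithmetic in the clock example can be checked without PDFlib. The sketch below, assuming PHP 7 or later, repeats the three expressions passed to pdf_rotate(); the function name `clock_hand_angles()` is hypothetical. An angle of 0 points along the positive x axis (3 o'clock) and positive angles turn counterclockwise, which is why 3 hours and 15 minutes or seconds are subtracted.

```php
<?php
// Hypothetical helper mirroring the pdf_rotate() arguments of the clock example.
// Angles are in degrees; 0 points at 3 o'clock, positive is counterclockwise.
function clock_hand_angles(int $hours, int $minutes, int $seconds): array
{
    return [
        // 30 degrees per hour, advanced by the fraction of the hour elapsed
        'hour'   => -(($minutes / 60.0) + $hours - 3.0) * 30.0,
        // 6 degrees per minute, advanced by the fraction of the minute elapsed
        'minute' => -(($seconds / 60.0) + $minutes - 15.0) * 6.0,
        // 6 degrees per second
        'second' => -(($seconds - 15.0) * 6.0),
    ];
}

// At 3:00:00 the hour hand lies on the x axis and the other two hands
// point straight up (90 degrees counterclockwise).
$angles = clock_hand_angles(3, 0, 0);
```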

See Also

Note: An alternative PHP module for PDF document creation based on FastIO's ClibPDF is available. Please see the ClibPDF section for details. Note that ClibPDF has a slightly different API than PDFlib.

Table of Contents
pdf_add_annotation -- Deprecated: Adds annotation
pdf_add_bookmark -- Adds bookmark for current page
pdf_add_launchlink -- Add a launch annotation for current page
pdf_add_locallink -- Add a link annotation for current page
pdf_add_note -- Sets annotation for current page
pdf_add_outline -- Deprecated: Adds bookmark for current page
pdf_add_pdflink -- Adds file link annotation for current page
pdf_add_thumbnail -- Adds thumbnail for current page
pdf_add_weblink -- Adds weblink for current page
pdf_arc -- Draws an arc (counterclockwise)
pdf_arcn -- Draws an arc (clockwise)
pdf_attach_file -- Adds a file attachment for current page
pdf_begin_page -- Starts new page
pdf_begin_pattern -- Starts new pattern
pdf_begin_template -- Starts new template
pdf_circle -- Draws a circle
pdf_clip -- Clips to current path
pdf_close_image -- Closes an image
pdf_close_pdi_page -- Close the page handle
pdf_close_pdi -- Close the input PDF document
pdf_close -- Closes a pdf resource
pdf_closepath_fill_stroke -- Closes, fills and strokes current path
pdf_closepath_stroke -- Closes path and draws line along path
pdf_closepath -- Closes path
pdf_concat -- Concatenate a matrix to the CTM
pdf_continue_text -- Outputs text in next line
pdf_curveto -- Draws a curve
pdf_delete -- Deletes a PDF object
pdf_end_page -- Ends a page
pdf_end_pattern -- Finish pattern
pdf_end_template -- Finish template
pdf_endpath -- Deprecated: Ends current path
pdf_fill_stroke -- Fills and strokes current path
pdf_fill -- Fills current path
pdf_findfont -- Prepare font for later use with pdf_setfont()
pdf_get_buffer -- Fetch the buffer containing the generated PDF data
pdf_get_font -- Deprecated: font handling
pdf_get_fontname -- Deprecated: font handling
pdf_get_fontsize -- Deprecated: font handling
pdf_get_image_height -- Deprecated: returns height of an image
pdf_get_image_width -- Deprecated: Returns width of an image
pdf_get_majorversion -- Returns the major version number of the PDFlib
pdf_get_minorversion -- Returns the minor version number of the PDFlib
pdf_get_parameter -- Gets certain parameters
pdf_get_pdi_parameter -- Get some PDI string parameters
pdf_get_pdi_value -- Gets some PDI numerical parameters
pdf_get_value -- Gets certain numerical value
pdf_initgraphics -- Resets graphic state
pdf_lineto -- Draws a line
pdf_makespotcolor -- Makes a spotcolor
pdf_moveto -- Sets current point
pdf_new -- Creates a new pdf resource
pdf_open_ccitt -- Opens a new image file with raw CCITT data
pdf_open_file -- Opens a new pdf object
pdf_open_gif -- Deprecated: Opens a GIF image
pdf_open_image_file -- Reads an image from a file
pdf_open_image -- Versatile function for images
pdf_open_jpeg -- Deprecated: Opens a JPEG image
pdf_open_memory_image -- Opens an image created with PHP's image functions
pdf_open_pdi_page -- Prepare a page
pdf_open_pdi -- Opens a PDF file
pdf_open_png -- Deprecated: Opens a PNG image
pdf_open_tiff -- Deprecated: Opens a TIFF image
pdf_open -- Deprecated: Open a new pdf object
pdf_place_image -- Places an image on the page
pdf_place_pdi_page -- Places a PDF page on the page
pdf_rect -- Draws a rectangle
pdf_restore -- Restores formerly saved environment
pdf_rotate -- Sets rotation
pdf_save -- Saves the current environment
pdf_scale -- Sets scaling
pdf_set_border_color -- Sets color of border around links and annotations
pdf_set_border_dash -- Sets dash style of border around links and annotations
pdf_set_border_style -- Sets style of border around links and annotations
pdf_set_char_spacing -- Deprecated: Sets character spacing
pdf_set_duration -- Deprecated: Sets duration between pages
pdf_set_font -- Deprecated: Selects a font face and size
pdf_set_horiz_scaling -- Deprecated: Sets horizontal scaling of text
pdf_set_info_author -- Deprecated: Fills the author field of the document
pdf_set_info_creator -- Deprecated: Fills the creator field of the document
pdf_set_info_keywords -- Deprecated: Fills the keywords field of the document
pdf_set_info_subject -- Deprecated: Fills the subject field of the document
pdf_set_info_title -- Deprecated: Fills the title field of the document
pdf_set_info -- Fills a field of the document information
pdf_set_leading -- Deprecated: Sets distance between text lines
pdf_set_parameter -- Sets certain parameters
pdf_set_text_matrix -- Deprecated: Sets the text matrix
pdf_set_text_pos -- Sets text position
pdf_set_text_rendering -- Deprecated: Determines how text is rendered
pdf_set_text_rise -- Deprecated: Sets the text rise
pdf_set_value -- Sets certain numerical value
pdf_set_word_spacing -- Deprecated: Sets spacing between words
pdf_setcolor -- Sets fill and stroke color
pdf_setdash -- Sets dash pattern
pdf_setflat -- Sets flatness
pdf_setfont -- Set the current font
pdf_setgray_fill -- Sets filling color to gray value
pdf_setgray_stroke -- Sets drawing color to gray value
pdf_setgray -- Sets drawing and filling color to gray value
pdf_setlinecap -- Sets linecap parameter
pdf_setlinejoin -- Sets linejoin parameter
pdf_setlinewidth -- Sets line width
pdf_setmatrix -- Sets current transformation matrix
pdf_setmiterlimit -- Sets miter limit
pdf_setpolydash -- Deprecated: Sets complicated dash pattern
pdf_setrgbcolor_fill -- Sets filling color to rgb color value
pdf_setrgbcolor_stroke -- Sets drawing color to rgb color value
pdf_setrgbcolor -- Sets drawing and filling color to rgb color value
pdf_show_boxed -- Output text in a box
pdf_show_xy -- Output text at given position
pdf_show -- Output text at current position
pdf_skew -- Skews the coordinate system
pdf_stringwidth -- Returns width of text using current font
pdf_stroke -- Draws line along path
pdf_translate -- Sets origin of coordinate system

User Contributed Notes

phpguy at theos dot me dot uk
01-Mar-2006 09:17


On my system at least (Debian stable) the command to install pdflib is not

pear install pdflib

but rather

pecl install pdflib

spingary at yahoo dot com
13-Jan-2006 04:55


I was having trouble streaming inline PDFs using PHP 5.0.2 and Apache 2.0.54.

This is my code:

<?php
header("Pragma: public");
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: must-revalidate");
header("Content-type: application/pdf");
header("Content-Length: ".filesize($file));
header("Content-disposition: inline; filename=$file");
header("Accept-Ranges: ".filesize($file));
readfile($file);
exit();
?>

It worked fine in Mozilla Firefox (1.0.7), but IE (6.0.2800.1106) would not bring up the Adobe Reader plugin and instead asked me to save the file or open it as a PHP file.

Oddly enough, when I turned off zlib.output_compression it started working. I guess the compression confuses IE. I tried leaving out the Content-Length header, thinking the problem might be a size mismatch (the uncompressed size versus the compressed size actually received), but without it Firefox breaks too.

What I ended up doing was disabling zlib compression for the PDF output pages using ini_set:

<?php
ini_set('zlib.output_compression', 'Off');
?>

Maybe this will help someone. Will post over in the PDF section as well.

davedotmarshallatcspencerltddotcodotuk
08-Nov-2005 08:17


RE: thodge at ipswich dot qld dot gov dot au

I think the line:

   preg_match_all(
       '/(T[wdcm*])[\s]*(\[([^\]]*)\]|\(([^\)]*)\))[\s]*Tj/si',
       $postScriptData,
       $matches
   );

should read:

   preg_match_all(
       '/(T[wdcm*])[\s]*(\[([^\]]*)\]|\(([^\)]*)\))[\s]*Tj/si',
       $psData,
       $matches
   );

ontwerp AT zonnet.nl
04-Nov-2005 03:01


I was searching for a low-cost/open-source option for combining static HTML files [as templates] with dynamic output from Perl or PHP routines. Sooner or later I found that this was the most stable, fastest and most customizable way to produce usable PDFs with nice formatting:

1] Create the HTML page output [Perl -> HTML output, direct HTML output from any app, PHP echo statements etc.] [sort these HTML files locally]

2] Convert all HTML [including web image links, tables, font formatting etc.] to [E]PS files with the Perl app html2ps [as mentioned below]
http://user.it.uu.se/~jan/html2ps.html [sort all PS files by their future PDF page positions]

3] Use the free ps2pdf/ps2pdfwr Linux application
http://www.ps2pdf.com/convert/index.htm [uses the Ghostscript and Ghostview libraries and so on]
It has great formatting options like headers, footers, numbering etc.
[sort the PDF files]

4] Merge all PDF files into one PDF file with pdftk [PDF toolkit], which offers optional compression/encryption, background stamps etc.

One might ask why I use different scripts:
- The Perl/PHP combination is great: in my experience Perl is faster at some tasks, like the conversion to PS files.
- PS to PDF is quicker than direct PHP to PDF [in my experience!].
- I have total control over every file. Whenever I change the HTML files used as templates, I only need an editor or another app for it [online or offline].

P.S. I had to build an open-source solution for creating simple report analyses based on things like:
- a first page [name / title / # / date]
- some static info [like an introduction, copyrights etc.]
- some dynamic info [output from PHP -> database queries] combined with HTML tags/images etc.

All of this is mixed [and therefore separated into files for transparency]. The three-step approach, data -> HTML, HTML -> PS, PS -> PDF, is also easier and quicker to program or adjust at every step.

Correct me if I'm wrong [mail me].

ing. Valentijn Langendorff
Design & Technologist

g8z at yahoo dot com
16-Oct-2005 05:49


For anyone who's in need of a good HTML-to-PDF generator that uses PDFLib:



http://www.tufat.com/html2ps.php



This is free, GPL software, with 100% of the source code included.

ragnar at deulos dot com
08-Oct-2005 10:30


After a whole day spent understanding how PDFlib works, I came to the conclusion that drawing is hard enough when you have to describe everything in code; drawing a single line can take something like four lines of code. So I wrote my own functions to make life easier and the code easier to understand and modify. I also made a function that draws a rectangle with rounded corners, with the possibility of filling it ;)

You can get it from http://www.deulos.com/pdf_php.php

Feel free to make suggestions or whatever you like ;o)

18-Sep-2005 02:26


Some code that can be very helpful for starters.

<?php
    // Declare PDF file

    $pdf = pdf_new();
    PDF_open_file($pdf, ""); // an empty filename creates the document in memory

    // Set document properties

    PDF_set_info($pdf, "author", "Alexander Pas");
    PDF_set_info($pdf, "title", "PDF by PHP Example");
    PDF_set_info($pdf, "creator", "Alexander Pas");
    PDF_set_info($pdf, "subject", "Testing Code");

    // Get fonts to use

    pdf_set_parameter($pdf, "FontOutline", "Arial=arial.ttf"); // get a custom font
    $font1 = PDF_findfont($pdf, "Helvetica-Bold",  "winansi", 0); // declare default font
    $font2 = PDF_findfont($pdf, "Arial",  "winansi", 1); // declare custom font & embed into file

    /*
    You can safely use the following 14 font names (the default fonts):
    Courier, Courier-Bold, Courier-Oblique, Courier-BoldOblique
    Helvetica, Helvetica-Bold, Helvetica-Oblique, Helvetica-BoldOblique
    Times-Roman, Times-Bold, Times-Italic, Times-BoldItalic
    Symbol, ZapfDingbats
    */

    // Load the images

    $image1 = PDF_open_image_file($pdf, "gif", "image.gif"); // supported file types are: jpeg, tiff, gif, png.

    // Make first page

    PDF_begin_page($pdf, 450, 450); // page width and height.
    $bookmark = PDF_add_bookmark($pdf, "Front"); // add a top-level bookmark.
    PDF_setfont($pdf, $font1, 12); // use this font from now on.
    PDF_show_xy($pdf, "First Page!", 5, 225); // show this text at x=5, y=225 (measured from the lower-left corner).
    pdf_place_image($pdf, $image1, 255, 5, 1); // the last argument is the scaling factor.
    PDF_end_page($pdf); // end of page.

    // Make second page

    PDF_begin_page($pdf, 450, 225); // page width and height.
    $bookmark1 = PDF_add_bookmark($pdf, "Chapter1", $bookmark); // add a nested bookmark (can be nested multiple times).
    PDF_setfont($pdf, $font2, 12); // use this font from now on.
    PDF_show_xy($pdf, "Chapter1!", 225, 5);
    PDF_add_bookmark($pdf, "Chapter1.1", $bookmark1); // add a nested bookmark (inside an already nested one).
    PDF_setfont($pdf, $font1, 12);
    PDF_show_xy($pdf, "Chapter1.1", 225, 5);
    PDF_end_page($pdf);

    // Finish the PDF file

    PDF_close($pdf); // end of PDF file.
    $output = PDF_get_buffer($pdf); // fetch the generated file into a variable.

    // Output area

    header("Content-type: application/pdf"); // set file type to PDF.
    header("Content-Length: ".strlen($output)); // content length
    header("Content-Disposition: attachment; filename=test.pdf"); // you can use inline or attachment.
    echo $output; // send the PDF data.

    // Cleanup

    PDF_delete($pdf);
?>

thodge at ipswich dot qld dot gov dot au
05-Sep-2005 01:22


Yet another addition to the PDF text extraction code last posted by jorromer. That code only seemed to work for PDF 1.2 (Acrobat 3.x) or below. This pdfExtractText function uses regular expressions to cover cases I have found in PDF 1.3 and 1.4 documents. The code also handles closing brackets in the text stream, which were ignored by the previous version. My regular expression skills are somewhat lacking, so improvements may be possible by a more skilled programmer. I'm sure there are still cases that this function will not handle, but I haven't come across any yet...



<?php

function pdf2string($sourcefile) {

    $fp = fopen($sourcefile, 'rb');
    $content = fread($fp, filesize($sourcefile));
    fclose($fp);

    $searchstart = 'stream';
    $searchend = 'endstream';
    $pdfText = '';
    $pos = 0;
    $pos2 = 0;
    $startpos = 0;

    while ($pos !== false && $pos2 !== false) {

        $pos = strpos($content, $searchstart, $startpos);
        $pos2 = strpos($content, $searchend, $startpos + 1);

        if ($pos !== false && $pos2 !== false){

            if ($content[$pos] == 0x0d && $content[$pos + 1] == 0x0a) {
                $pos += 2;
            } else if ($content[$pos] == 0x0a) {
                $pos++;
            }

            if ($content[$pos2 - 2] == 0x0d && $content[$pos2 - 1] == 0x0a) {
                $pos2 -= 2;
            } else if ($content[$pos2 - 1] == 0x0a) {
                $pos2--;
            }

            $textsection = substr(
                $content,
                $pos + strlen($searchstart) + 2,
                $pos2 - $pos - strlen($searchstart) - 1
            );
            // Streams that are not zlib-compressed yield false here and are skipped.
            $data = @gzuncompress($textsection);
            $pdfText .= pdfExtractText($data);
            $startpos = $pos2 + strlen($searchend) - 1;

        }
    }

    return preg_replace('/(\s)+/', ' ', $pdfText);

}

function pdfExtractText($psData){

    if (!is_string($psData)) {
        return '';
    }

    $text = '';

    // Handle brackets in the text stream that could be mistaken for
    // the end of a text field. I'm sure you can do this as part of the
    // regular expression, but my skills aren't good enough yet.
    $psData = str_replace('\)', '##ENDBRACKET##', $psData);
    $psData = str_replace('\]', '##ENDSBRACKET##', $psData);

    preg_match_all(
        '/(T[wdcm*])[\s]*(\[([^\]]*)\]|\(([^\)]*)\))[\s]*Tj/si',
        $psData,
        $matches
    );
    for ($i = 0; $i < sizeof($matches[0]); $i++) {
        if ($matches[3][$i] != '') {
            // Run another match over the contents.
            preg_match_all('/\(([^)]*)\)/si', $matches[3][$i], $subMatches);
            foreach ($subMatches[1] as $subMatch) {
                $text .= $subMatch;
            }
        } else if ($matches[4][$i] != '') {
            $text .= ($matches[1][$i] == 'Tc' ? ' ' : '') . $matches[4][$i];
        }
    }

    // Translate special characters and put back brackets.
    $trans = array(
        '...'                => '&hellip;',
        '\205'                => '&hellip;',
        '\221'                => chr(145),
        '\222'                => chr(146),
        '\223'                => chr(147),
        '\224'                => chr(148),
        '\226'                => '-',
        '\267'                => '&bull;',
        '\('                => '(',
        '\['                => '[',
        '##ENDBRACKET##'    => ')',
        '##ENDSBRACKET##'    => ']',
        chr(133)            => '-',
        chr(141)            => chr(147),
        chr(142)            => chr(148),
        chr(143)            => chr(145),
        chr(144)            => chr(146),
    );
    $text = strtr($text, $trans);

    return $text;

}

?>

29-Aug-2005 12:58


If you want to display the number of pages (for example: page 1 of 3), then the following code could be helpful:

<?php
// ...

$pdf->begin_page_ext(842, 595, "");
// ... add text, images, ...
$pdf->suspend_page("");

$pdf->begin_page_ext(842, 595, "");
// ... add text, images, ...
$pdf->suspend_page("");

// ... create all pages

$pdf->resume_page("pagenumber 1");
// ... add number of pages to page 1
$pdf->end_page_ext("");

$pdf->resume_page("pagenumber 2");
// ... add number of pages to page 2
$pdf->end_page_ext("");

// ...
?>

bg@msdotcom
07-Aug-2005 02:09


didn't see these mentioned:



another free pdf lib class in php

http://www.gnuvox.com/pdf4php/



another

http://www.potentialtech.com/ppl.php

jorromer at uchile dot cl -- Krash
08-Jun-2005 01:51


I recently used mattb's code below to extract text from PDF files. I modified it to extract only the text fields.

I hope this helps someone.

Here is the function:

<?php

  $text = pdf2string("file.pdf");
  echo $text;

  function pdf2string($sourcefile){
    $fp = fopen($sourcefile, 'rb');
    $content = fread($fp, filesize($sourcefile));
    fclose($fp);

    $searchstart = 'stream';
    $searchend = 'endstream';
    $pdfdocument = '';
    $pos = 0;
    $pos2 = 0;
    $startpos = 0;

    while ($pos !== false && $pos2 !== false) {
      $pos = strpos($content, $searchstart, $startpos);
      $pos2 = strpos($content, $searchend, $startpos + 1);

      if ($pos !== false && $pos2 !== false) {
        if ($content[$pos]==0x0d && $content[$pos+1]==0x0a) $pos+=2;
        else if ($content[$pos]==0x0a) $pos++;

        if ($content[$pos2-2]==0x0d && $content[$pos2-1]==0x0a) $pos2-=2;
        else if ($content[$pos2-1]==0x0a) $pos2--;

        $textsection = substr($content, $pos + strlen($searchstart) + 2, $pos2 - $pos - strlen($searchstart) - 1);
        $data = @gzuncompress($textsection);
        $data = ExtractText2($data);
        $startpos = $pos2 + strlen($searchend) - 1;

        if ($data === false) {
          return -1;
        }

        $pdfdocument .= $data;
      }
    }
    return $pdfdocument;
  }

  function ExtractText2($postScriptData){
    $sw = true;
    $textStart = 0;
    $text = ''; // initialise the result so the concatenation below is well defined
    $len = strlen($postScriptData);

    while ($sw) {
      $ini = strpos($postScriptData, '(', $textStart);
      $end = strpos($postScriptData, ')', $textStart+1);
      if (($ini>0) && ($end>$ini)) {
        // keep only strings that are followed directly by the Tj (show text) operator
        $valtext = strpos($postScriptData,'Tj',$end+1);
        if ($valtext == $end + 2)
          $text .= substr($postScriptData,$ini+1,$end - $ini - 1);
      }

      $textStart = $end + 1;
      if ($len<=$textStart) $sw=false;

      if (($ini == 0) && ($end == 0)) $sw=false;
    }

    $trans = array("\\341" => "a","\\351" => "e","\\355" => "i","\\363" => "o","\\223" => "","\\224" => "");
    $text  = strtr($text, $trans);
    return $text;
  }
?>

jonathan dot beckett at gmail dot com
06-Jun-2005 06:03


After spending ages writing my own PDF to text extraction routine (well... a couple of hours), I realised that you have to interpret the entire stream to have a hope of getting all the characters you really want, so I started digging.

I then discovered that the XPDF project has everything you need to deal with PDFs. Linux and Win32 binaries are available, and most distros have the RPMs too.

The resulting command is this (the filename is passed through escapeshellarg() so that spaces and shell metacharacters are safe):

$result = shell_exec("pdftotext -raw ".escapeshellarg($filename)." -");

...it works perfectly for content searching purposes.

q
02-Jun-2005 05:24


It seems that the newest Adobe Reader 7 (using PDF 1.6) is no longer fully compatible with PDFs generated with PDFlib <= 5. The solution is to upgrade to PDFlib 6. Unfortunately, this means coughing up some more cash to the authors if you need to get rid of the watermark.

santa at selekcia dot com
19-May-2005 02:53


The pdf2string function used here does not work correctly with all PDFs. There are problems when the PDF uses 0x0D, 0x0A as the line separator. A better way is to determine the stream length from the /Length tag and to check whether the first characters after the stream keyword are 0x0d alone or 0x0d and 0x0a together.

When I have updated this code I will send it, but if someone has already changed it, please publish it. It might be better to extend the standard PDF library included with PHP with functionality for post-processing PDFs. That is sometimes useful, for example for working with templates.

Thanks to all developers extending the PHP functions, and to the base team.
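
The /Length approach suggested above could look roughly like the following. This is a sketch under stated assumptions, not a full PDF parser: it assumes PHP 7.1 or later, the function name `read_stream_by_length()` is hypothetical, and only a direct integer /Length is handled (an indirect reference such as "/Length 5 0 R" is reported as null rather than resolved).

```php
<?php
// Hypothetical helper: return the raw bytes of the first stream found at or
// after $offset, using the /Length entry instead of searching for "endstream".
function read_stream_by_length(string $content, int $offset = 0): ?string
{
    // Match a direct integer length; skip indirect references ("/Length 5 0 R").
    $pattern = '/\/Length\s+(\d+)\b(?!\s+\d+\s+R\b)/';
    if (!preg_match($pattern, $content, $m, PREG_OFFSET_CAPTURE, $offset)) {
        return null;
    }
    $length = (int) $m[1][0];

    // Locate the "stream" keyword that follows the dictionary entry.
    $pos = strpos($content, 'stream', $m[0][1]);
    if ($pos === false) {
        return null;
    }
    $pos += strlen('stream');

    // Skip the end-of-line marker after the keyword: CR LF, or a single LF or CR.
    if (substr($content, $pos, 2) === "\r\n") {
        $pos += 2;
    } elseif (isset($content[$pos]) && ($content[$pos] === "\n" || $content[$pos] === "\r")) {
        $pos += 1;
    }

    // Take exactly $length bytes; a shorter result means the file is truncated.
    $data = substr($content, $pos, $length);
    return strlen($data) === $length ? $data : null;
}

$sample = "<< /Length 11 >>\nstream\r\nhello world\r\nendstream";
$bytes  = read_stream_by_length($sample);
```

Reading a fixed number of bytes avoids the ambiguity that arises when the stream data itself contains line separators or the text "endstream".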

webadmin at secretscreen dot com
06-Apr-2005 05:51


I found this information about PDFlib scopes on a Chinese (I think) site and translated it. I was trying to call pdf_setfont and kept getting the wrong scope error. It turns out the call has to be made in page scope, so pdf_setfont only works when called between pdf_begin_page and pdf_end_page.

#########################################

When a PDFlib API function is called, the error "Can't - IN 'document' scope" occurs

PDFlib has a concept of "scope": every PDFlib API function may only be called within certain scopes. This error occurs when a function is called outside the scopes allowed for it. Use the list below to check where your call is made.

Path: begins with one of PDF_moveto (), PDF_circle (), PDF_arc (), PDF_arcn (), PDF_rect () and ends with one of PDF_stroke (), PDF_closepath_stroke (), PDF_fill (), PDF_fill_stroke (), PDF_closepath_fill_stroke (), PDF_clip (), PDF_endpath ()

Page: between PDF_begin_page () and PDF_end_page (), outside path scope

Template: between PDF_begin_template () and PDF_end_template (), outside path scope

Pattern: between PDF_begin_pattern () and PDF_end_pattern (), outside path scope

Font: between PDF_begin_font () and PDF_end_font (), outside glyph scope

Glyph: between PDF_begin_glyph () and PDF_end_glyph (), outside path scope

Document: between PDF_open_* () and PDF_close (), outside page, template and pattern scope

Object: between PDF_new () and PDF_delete (), when in none of the other scopes

Null: outside object scope



Any: All scopes other than  



##########################################



Hope this helps others as much as it helped me!!!

kevin at kevinnading dot com
31-Mar-2005 04:46


Hey people, about the bug with IE not accepting a PDF created via POST: if you can use the GET method instead, it will work fine. Both POST and GET work in Firefox, but only GET seems to work in IE. However, you can send a Content-Disposition: attachment header (which requires user interaction) to pop up an open/save dialog box, and then both POST and GET work in IE and Firefox. Hope this helps!
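A small sketch of the attachment approach, under the assumption that the PDF is already in a string: pdfDownloadHeaders() is a hypothetical helper that only builds the header lines, so it can be tried without a browser or PDFlib. Whether these headers cure a particular IE problem is not something the sketch can guarantee.

```php
<?php
// Hypothetical helper: returns the header lines for sending a PDF either as a
// download (attachment) or for in-browser display (inline).
function pdfDownloadHeaders($filename, $length, $attachment = true)
{
    // Keep only the base name and drop characters that would break the
    // quoted filename value or allow header injection.
    $safeName    = str_replace(array('"', "\r", "\n"), '', basename($filename));
    $disposition = $attachment ? 'attachment' : 'inline';

    return array(
        'Content-Type: application/pdf',
        'Content-Disposition: ' . $disposition . '; filename="' . $safeName . '"',
        'Content-Length: ' . (int) $length,
    );
}

// Usage sketch, where $buf would hold the generated PDF bytes:
// foreach (pdfDownloadHeaders('report.pdf', strlen($buf)) as $line) {
//     header($line);
// }
// echo $buf;
```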

beanjammin dot removethis at gmail dot com
31-Mar-2005 02:32


This was originally posted by mat3582 at NOSPAM dot hotmail dot com on the Session Handling Functions manual page, however as it is pdf specific I hope that moving it here will make it easier for others to find.



I fought this for longer than I'd care to admit after a web server distros switch before discovering my problem was session related and subsequently discovering Mat's post.



// Mats Note:



Outputting a PDF file to an MSIE browser didn't work (MSIE mistook the file for an ActiveX control, then failed to download) until I added

<?php

ini_set('session.cache_limiter',"0");

?>

to my script. I hope this will help someone else.



// End Mats Note



In addition to Mat's suggestion, the php.ini file can also be edited to add or change the session.cache_limiter setting to 0.

chu61 dot tw at gmail dot com
07-Mar-2005 11:57


How do you find out how many pages a PDF has? I read the PDF specification, version 1.6, and found this:

PDF uses "page tree nodes" to define the ordering of pages in the document. The tree structure allows PDF applications to quickly open a document containing thousands of pages using little memory.

If a PDF has 63 pages, the page tree node will look like this:

2 0 obj
<< /Type /Pages
    /Kids [ 4 0 R
            10 0 R
          ]
    /Count 63        <---- YES, got it
>>
endobj

P.S. A PDF may contain more than one page tree node. The right answer is in the root page tree node: a node that has /Count XX together with a /Parent XXX entry is not the root page tree node.

So you must find the node with /Count XX and without a /Parent entry, and you'll get the total number of pages in the PDF.



%PDF-1.0  ~  %PDF-1.5 all works
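The rule above (take /Count from the /Pages node that has no /Parent) can be sketched as follows. pdfPageCount() is a hypothetical name, and the sketch assumes the page tree objects are stored as plain text in the file rather than inside compressed object streams.

```php
<?php
// Hypothetical helper: returns the page count from the root page tree node,
// or null when no such node is found in the raw file contents.
function pdfPageCount($content)
{
    // Treat everything up to each "endobj" as one object.
    foreach (explode('endobj', $content) as $object) {
        // Only page tree nodes are of interest (/Type /Pages, not /Page).
        if (!preg_match('/\/Type\s*\/Pages\b/', $object)) {
            continue;
        }
        // Intermediate nodes carry a /Parent entry; the root does not.
        if (strpos($object, '/Parent') !== false) {
            continue;
        }
        if (preg_match('/\/Count\s+(\d+)/', $object, $m)) {
            return (int) $m[1];
        }
    }
    return null;
}
```

A typical call would be pdfPageCount(file_get_contents($path)), with $path pointing at the PDF to inspect.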



Alex from Taipei, Taiwan

mattb at bluewebstudios dot com
05-Feb-2005 05:44


I recently tested Donatas' code below for the extraction of text from PDF files.  After running into a few problems where PDF files were not being read at all, I've modified it somewhat.  It still isn't perfect, but should work great for searching.  Thanks Donatas.



<?php

$test = pdf2string("<pathtoPDFfile>");

echo "$test";



# Returns a -1 if uncompression failed

function pdf2string($sourcefile)

{

   $fp = fopen($sourcefile, 'rb');

   $content = fread($fp, filesize($sourcefile));

   fclose($fp);



   # Locate all text hidden within the stream and endstream tags

   $searchstart = 'stream';

   $searchend = 'endstream';

   $pdfdocument = "";



   $pos = 0;

   $pos2 = 0;

   $startpos = 0;

   # Iterate through each stream block

   while( $pos !== false && $pos2 !== false )

   {

      # Grab beginning and end tag locations if they have not yet been parsed

      $pos = strpos($content, $searchstart, $startpos);

      $pos2 = strpos($content, $searchend, $startpos + 1);

      if( $pos !== false && $pos2 !== false )

      {

         # Extract compressed text from between stream tags and uncompress

         $textsection = substr($content, $pos + strlen($searchstart) + 2, $pos2 - $pos - strlen($searchstart) - 1);

         $data = @gzuncompress($textsection);

         # Clean up text via a special function

         $data = ExtractText($data);

         # Increase our PDF pointer past the section we just read

         $startpos = $pos2 + strlen($searchend) - 1;

         if( $data === false ) { return -1; }

         $pdfdocument = $pdfdocument . $data;

      }

   }



   return $pdfdocument;

}



function ExtractText($postScriptData)

{

   # Start with a defined search offset and an empty result string

   $textStart = 0;

   $plainText = '';

   while( (($textStart = strpos($postScriptData, '(', $textStart)) && ($textEnd = strpos($postScriptData, ')', $textStart + 1)) && substr($postScriptData, $textEnd - 1) != '\\') )

   {

      $plainText .= substr($postScriptData, $textStart + 1, $textEnd - $textStart - 1);

      if( substr($postScriptData, $textEnd + 1, 1) == ']' ) // This adds quite some additional spaces between the words

      {

         $plainText .= ' ';

      }



      $textStart = $textStart < $textEnd ? $textEnd : $textStart + 1;

   }



   return stripslashes($plainText);

}

?>

ken at thesmallbox.com
30-Oct-2004 11:13


Please note that these functions have been removed from PHP 5. They are still available through the pdflib PECL module.

14-Aug-2004 02:58


for people who are using PDF_FINDFONT there is a catch..

--------------------------------------------------------



int PDF_findfont(PDF *p, const char *fontname, const char *encoding, int embed)



Deprecated, use PDF_load_font( ).



---- 

use PDF_load_font instead....

arjen at queek dot nl
15-Jul-2004 10:50


If you prefer an OO approach to the PDF functions, you can use this snippet of code (PHP 5 only, and it does add some overhead). It's just a starting point; extend or improve it as you wish.

You can call any pdf_* function as a method on the object by stripping pdf_ from the function name. Plus, you don't have to pass the PDF resource as the first argument.



For example:

<?php

pdf_show($pdf, $text);    // Where $pdf is your pdf-resource

?>



Can become:

<?php

$pdf->show($text);        // Where $pdf is your PDF-object

?>



Code:

<?php



class PDF {



    private $pdf;

    

    /* public Void __construct(): Constructor */

    public function __construct() {

        $this->pdf = pdf_new();

    }

    

    /* public Mixed __call(): Re-route all function calls to the PHP-functions */

    public function __call($function, $arguments) {

        // Prepend the pdf resource to the arguments array

        array_unshift($arguments, $this->pdf);

        

        // Call the PHP function

        return call_user_func_array('pdf_' . $function, $arguments);

    }



}



?>
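To try the same __call() forwarding without PDFlib installed, the sketch below applies it to PHP's built-in array_* functions instead. PrefixProxy is a hypothetical class, not part of the snippet above; it also adds a function_exists() check so that a mistyped method name raises an exception with a clear message.

```php
<?php
// Hypothetical generalisation of the wrapper: forwards $obj->name(...) to
// <prefix>name($subject, ...), where $subject is stored in the object.
class PrefixProxy
{
    private $prefix;
    private $subject;

    public function __construct($prefix, $subject)
    {
        $this->prefix  = $prefix;
        $this->subject = $subject;
    }

    public function __call($function, $arguments)
    {
        $target = $this->prefix . $function;
        if (!function_exists($target)) {
            throw new BadMethodCallException("No such function: $target()");
        }
        // Prepend the stored subject, as the PDF class does with its resource.
        array_unshift($arguments, $this->subject);
        return call_user_func_array($target, $arguments);
    }
}

$list = new PrefixProxy('array_', array(3, 1, 2));
$sum  = $list->sum();      // forwards to array_sum(array(3, 1, 2))
$rev  = $list->reverse();  // forwards to array_reverse(array(3, 1, 2))
```

Functions that take their first argument by reference (array_push(), for example) are not a good fit for this kind of forwarding, because the proxy passes a copy of the stored value.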

michi (Alt+Q) marel.at
01-Jul-2004 10:10


<?PHP

/* A little helpful function to calculate millimeters to points */

function calcToPt($intMillimeter) {

  $intPoints = ($intMillimeter*72)/25.4;

  $intPoints = round($intPoints);

  return $intPoints;

}



/* For example: Create DIN A4 210x297 mm */

pdf_begin_page( $pdf, calcToPt(210), calcToPt(297)); // 595x842 pt

?>

donatas at spurgius dot com
23-Jun-2004 03:56


I've been looking for a way to extract plain text from PDF documents (needed to search for text inside 'em). Not being able to find one I wrote the needed functions myself. here you go folks.



<?php

  function pdf2string ($sourceFile)

  {

    $textArray = array ();

    $objStart = 0;

    

    $fp = fopen ($sourceFile, 'rb');

    $content = fread ($fp, filesize ($sourceFile));

    fclose ($fp);

    

    $searchTagStart = chr(13).chr(10).'stream';

    $searchTagStartLenght = strlen ($searchTagStart);

    

    while ((($objStart = strpos ($content, $searchTagStart, $objStart)) && ($objEnd = strpos ($content, 'endstream', $objStart+1))))

    {

      $data = substr ($content, $objStart + $searchTagStartLenght + 2, $objEnd - ($objStart + $searchTagStartLenght) - 2);

      $data = @gzuncompress ($data);

      

      if ($data !== FALSE && strpos ($data, 'BT') !== FALSE && strpos ($data, 'ET') !== FALSE)

      {

        $textArray [] = ExtractText ($data);

      }

      

      $objStart = $objStart < $objEnd ? $objEnd : $objStart + 1;

    }

    

    return $textArray;

  }

  

  function ExtractText ($postScriptData)

  {

    // start with a defined search offset and an empty result string

    $textStart = 0;

    $plainText = '';

    while ((($textStart = strpos ($postScriptData, '(', $textStart)) && ($textEnd = strpos ($postScriptData, ')', $textStart + 1)) && substr ($postScriptData, $textEnd - 1) != '\\'))

    {

      $plainText .= substr ($postScriptData, $textStart + 1, $textEnd - $textStart - 1);

      if (substr ($postScriptData, $textEnd + 1, 1) == ']') //this adds quite some additional spaces between the words

      {

        $plainText .= ' ';

      }

      

      $textStart = $textStart < $textEnd ? $textEnd : $textStart + 1;

    }

    

    return stripslashes ($plainText);

  }

?>

uwe at steinmann dot cx
13-May-2004 09:25


Those looking for a free replacement for PDFlib may consider pslib at http://pslib.sourceforge.net, which produces PostScript that can easily be turned into PDF by Acrobat Distiller or Ghostscript. The API is very similar, and even the hypertext functions are supported. There is also a PHP extension for pslib in PECL, called ps.

samcontact at myteks dot com
01-May-2004 07:28


Here is another great tutorial on basic PDF building w/ PHP:

http://hotwired.lycos.com/webmonkey/02/20/index3a.html?tw=programming




james at lanpad dot org
19-Apr-2004 11:36


PDFlib has a free replacement that is also much easier to work with (no more working with coordinates from the bottom left-hand corner!):



http://www.fpdf.org



It's also free for commercial use and is very usable, unlike the PDFlib extensions.

kristian at ruazgo dot com
12-Mar-2004 06:32


If you want an opensource class for creating PDF-files, you can find it at :

http://ros.co.nz/pdf/

matic at koncan dot net
12-Jan-2004 10:22


The solution for IE (refresh):

...

$buf = PDF_get_buffer($p);

$len = strlen($buf);

header("Cache-Control: no-store, no-cache, must-revalidate");

header("Content-type: application/pdf");

header("Content-Length: $len");

header("Content-Disposition: inline; filename=file.pdf");

print $buf;

PDF_delete($p);

SenorTZ senortz at nospam dot yahoo dot com
28-Jul-2003 09:23


About creating a PDF document based on the content of another document (let's say a text file):

I tried to pass the name of the file whose content should go into the PDF from a link on the sender page to the PDF-creator page. When I referred to the PDF-creator page via the link your_root/create_pdf.php?filename=$your_file_name, the page did not behave well if it contained a line like $filename = $_GET["filename"] before the PDF document was created.

I solved this by replacing the link on the sender page with a form with a button: the form has "create_pdf.php" as its action, "post" as its method, and a hidden field containing the "filename" value. It works this way if the PDF-creator page has a line like $filename = $_POST["filename"].

I would like to understand why it works this way and not the other way.



I hope this helps. Here are the pieces of code I used.



Sender page:

print("<form name='to_pdf' action='see_pdf_file.php' method='post'>");

print("<br/><input type='submit' value='PDF'><input type='hidden' name='filename' value='$filename'></form>");



PDF-creator page:

<?php

$filename = $_POST["filename"];

// file_get_contents() opens, reads and closes the file in one call

$file_content = file_get_contents($filename);

//

$file_content = wordwrap($file_content,72,"|");

$a_row = explode("|",$file_content);

$i = 0;

//

$pdf = pdf_new();

pdf_open_file($pdf, "");

pdf_begin_page($pdf, 595, 842);

pdf_set_font($pdf, "Times-Roman", 16, "host");

pdf_add_outline($pdf, "Page 1");

pdf_set_value($pdf, "textrendering", 1);

pdf_show_xy($pdf, 'The content of the file:',50,700);

// Write every wrapped row on its own line

foreach ($a_row as $row)

{

       pdf_continue_text($pdf, $row);

}

pdf_end_page($pdf);

pdf_close($pdf);

//

$data = pdf_get_buffer($pdf);

//

header("Content-type: application/pdf");

header("Content-disposition: inline; filename=test.pdf");

header("Content-length: " . strlen($data));

//

echo $data;

?>



PDFlib and PHP 4.3.1 used.



Thanks.

bmironov at jonview dot com
25-Jun-2003 06:46


RedHat 9 + Apache 2.0 + PHP 4.3.2 + Oracle 9i + PDFlib 5.0.1 (binary distribution)



It seems to be a working bundle if you do some magic with ./configure:



RedHat 9:

kernel-2.4.20-18.9



Apache 2.0.46:

./configure --enable-so --enable-rewrite=shared --enable-status --enable-mpm=prefork



PHP 4.3.2:

./configure \

--program-prefix= \

--prefix=/usr \

--exec-prefix=/usr \

--bindir=/usr/bin \

--sbindir=/usr/sbin \

--sysconfdir=/etc \

--datadir=/usr/share \

--includedir=/usr/include \

--libdir=/usr/lib \

--libexecdir=/usr/libexec \

--localstatedir=/var \

--sharedstatedir=/usr/com \

--mandir=/usr/share/man \

--infodir=/usr/share/info \

--with-config-file-path=/etc \

--with-config-file-scan-dir=/etc/php.d \

--without-tsrm-pthreads \    # !!!!!!!!!!!!!!!!!!!!

--with-zlib \

--with-gd \

--enable-gd-native-ttf \

--with-ttf \

--without-mysql \

--with-apxs2filter=/usr/local/apache2/bin/apxs \

--with-oci8 \

--enable-sigchild \

--enable-inline-optimization



Oracle9i:

ln -s $ORACLE_HOME/rdbms/public/nzerror.h $ORACLE_HOME/rdbms/demo/nzerror.h



ln -s $ORACLE_HOME/rdbms/public/nzt.h $ORACLE_HOME/rdbms/demo/nzt.h



ln -s $ORACLE_HOME/rdbms/public/ociextp.h $ORACLE_HOME/rdbms/demo/ociextp.h



If you want to use bundled GD-library then:

1) install following packages: libjpeg, libjpeg-devel, libpng, libpng-devel, freetype, freetype-devel, libtiff, libtiff-devel, zlib, zlib-devel



2) ln -s /usr/lib/libjpeg.so.62 /usr/lib/libjpeg.so

ln -s /usr/lib/libpng.so.62 /usr/lib/libpng.so



It seems to be a working combination, because it does NOT give you:

1) error message in Apache's error_log:

Module compiled with module API=20020429, debug=0, thread-safety=0

PHP compiled with module API=20020429, debug=0, thread-safety=1



2) error message in Apache's error_log:

[notice] child pid 12345 exit signal Segmentation fault (11)



3) crashes in MS Internet Explorer: it can show PDF output from your PHP script via the Acrobat plug-in, without confusing messages about opening "Adobe Acrobat Control for ActiveX".



Hope it will save you some time.



Good luck,

Boris

matt at nospam dot org
30-Aug-2002 02:11


Adding to my prior note, IE 6 has a strange feature of using GET when refreshing a pdf document, even though the page was originally POSTed to.  This may be the root cause of all the trouble listed above regarding posting and pdf.  



So, I recommend:

1) using a two page form/action handler when doing pdf rendering instead of the standard $PHP_SELF form/self handler to resolve the problem discussed above

2) Using either GET, or a self posting form that sets cookies and then redirects to the pdf creation page instead of POST, so that the parms get to the page.  HTH

gilbertng at hongkong dot com
11-Jun-2002 06:23


Hope it can help someone:



    $pdf = pdf_new();

    //pdf_open_file($pdf,"");

    if (!pdf_open_file($pdf, "")) {

            print "error";

            exit;    

    }

    



    PDF_set_parameter($pdf, "resourcefile", "/usr/local/pdflib/fonts/pdflib.upr");

    PDF_set_parameter($pdf,"prefix","/usr/local/pdflib/fonts");



    pdf_begin_page($pdf, 595, 842);

    pdf_add_outline($pdf, "Page 1");



    //pdf_set_font($pdf, "Times-Roman", 30, "host");

    // set a font for Chinese characters

    $font = pdf_findfont($pdf, "MHei-Medium", "B5pc-H",0);

    if ($font) {

        pdf_setfont($pdf, $font, 30);

    }    



    pdf_set_value($pdf, "textrendering",0);

    pdf_show_xy($pdf, " 100 Roman outlined", 50, 750);



    pdf_set_font($pdf, "Times-Roman", 30, "host");

    pdf_show_xy($pdf, " Times Roman outlined", 50, 600);

    pdf_moveto($pdf, 50, 740);

    pdf_lineto($pdf, 330, 740);

    pdf_stroke($pdf);

    pdf_end_page($pdf);

    pdf_close($pdf);



    $buf = pdf_get_buffer($pdf);

    $len = strlen($buf);



    header("Content-type: application/pdf");

    header("Content-Length: $len");

    header("Content-Disposition: inline; filename=foo.pdf");

    print $buf;



    pdf_delete($pdf);

chernyshevsky at hotmail dot com
06-May-2002 06:22


If you're wondering how to highlight words inside a PDF file, take a look at this script I've written (doesn't need PDFLib)





http://zeus.jtlnet.com/~conradis/pdfhi.php.txt





It's a whole lot harder than you think. (Rarely has so much code been written that does so little, that's what I say :-) Worth looking at if you want to do searches inside a PDF.

pbierans at lynet dot de
28-Mar-2002 01:56


Load the extension, open a PDF, add a font, modify the PDF in memory and send it to the browser:



<?php

  // no cache headers:

  header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");

  header("Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT");

  header("Cache-Control: no-store, no-cache, must-revalidate");

  header("Cache-Control: post-check=0, pre-check=0", false);

  header("Pragma: no-cache");



  $ext_name="libpdf_php.so";

    // libpdf_php.so is the PDFLIB for SunOS by "PDFlib GmbH"

    // visit http://www.pdflib.com



  // if the extension is not automatically loaded by Apache

  // dl() will try to load it on demand:

  if (!extension_loaded($ext_name) && !@dl($ext_name))

  {

    ?>

    <table width="100%" border="0"><tr><td align="center">

      <table style="border: solid #f0f0f0 2px;"><tr>

        <td valign="middle" style="padding: 20px; margin: 0px;">

          <p style="font-family: arial; font-size: 12px; ">

          <b>Sorry,</b><br>

          &nbsp;<br>

          A PDF can not be generated right now.<br>

          The administrator has been informed and will fix this as

          soon as possible.<br>

          Please try again later.

        </p>

      </td></tr></table>

    </td></tr></table>

    <?php

    mail('admin@domain.com','Error: PDFLib not found',

         "Called by script:\n  ".$_SERVER['SCRIPT_FILENAME'].'?'.$_SERVER['QUERY_STRING'],

         "From: warnings@domain.com\n");

    exit;

  } // verify that extension is usable



  // unique serial number:

  srand(microtime()*10000);

  $usnr= gmdate("Ymd-His-").rand(1000,9999).'-';

  $pdf_file=$usnr.'result.pdf';

  $src_file='source.pdf';



  // create pdf object

  $pdf = pdf_new();

  pdf_open_file($pdf);

  pdf_set_parameter($pdf, 'serial',      'if-you-have-one');



  // fonts to embed, they are in the folder of this file:

  pdf_set_parameter($pdf, 'FontAFM',     'TradeGothic=Tg______.afm');

  pdf_set_parameter($pdf, 'FontOutline', 'TradeGothic=Tg______.pfb');

  pdf_set_parameter($pdf, 'FontPFM',     'TradeGothic=Tg______.pfm');



  // load the source file:

  $src_doc   =pdf_open_pdi($pdf,$src_file,'', 0);

  $src_page  =pdf_open_pdi_page($pdf,$src_doc,1,'');

  $src_width =pdf_get_pdi_value($pdf,'width' ,$src_doc,$src_page,0);

  $src_height=pdf_get_pdi_value($pdf,'height',$src_doc,$src_page,0);



  pdf_begin_page($pdf, $src_width, $src_height);

  {

    // place the sourcefile to the background of the actual page:

    pdf_place_pdi_page($pdf,$src_page,0,0,1,1);

    pdf_close_pdi_page($pdf,$src_page);



    // modify the page:

    pdf_set_font($pdf, 'TradeGothic', 8, 'host');

    pdf_show_xy($pdf, 'Now: '.gmdate("Y-m-d H:i:s"),50,50);

  }

  pdf_end_page($pdf);

  pdf_close($pdf);



  // prepare output:

  $pdfdata = pdf_get_buffer($pdf); // to echo the pdf-data

  $pdfsize = strlen($pdfdata);     // IE requires the datasize



  // real datatype headers:

  header('Content-type: application/pdf');

  header('Content-disposition: attachment; filename="'.$pdf_file.'"');

  header('Content-length: '.$pdfsize);

  echo $pdfdata;

  exit; // keep this one so no #13#10 or #32 will be written

?>

bob at nijman dot de
02-Aug-2001 07:20


Try these tutorials:


************************************


http://www.dynamicwebpages.de/50.tutorials.php?dwp_tutorialID=11


http://www.dynamicwebpages.de/50.tutorials.php?dwp_tutorialID=13


http://www.zend.com/zend/spotlight/creatingpdfmay1.php


http://www.phpbuilder.com/columns/perugini20001026.php3


************************************

a dot marchand dot nospam at home dot com
02-May-2001 03:42


To continue on the Internet Explorer (IE) requirements: instead of Content-Length, a simple


header("Accept-Ranges: bytes");





is enough for the getpdf.php file to work right. Even Netscape will work without error with this modification.





Aurelien
