After installation you will find that the “crawler” extension offers a new “Processing Instruction” called “tx_staticpub_publish”. Before you can use that you must:
use either “realurl” or “simulateStaticDocuments” on your site (staticpub will work out by itself what the file names must be),
configure a publishing directory,
configure in the crawler which URLs to publish.
To configure the publishing directory:
Set in localconf.php:
$GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['staticpub']['publishDir'] = '_staticpub_/';
Create the directory “_staticpub_” in the root of the TYPO3 website (PATH_site).
You can choose a name other than “_staticpub_” for the directory, as long as the directory you create matches the configured “publishDir” value.
With Page TSconfig you can configure which URLs the crawler will try to publish. The format is a generic one defined by the crawler extension; please refer to its documentation for a detailed understanding. Here is an example:
tx_crawler.crawlerCfg.paramSets {
  staticpub = &L=[|_TABLE:pages_language_overlay;_FIELD:sys_language_uid]
  staticpub.procInstrFilter = tx_staticpub_publish
  staticpub.baseUrl = http://localhost:8888/typo3/dummy_4.0/
}
This configuration is enough to publish URLs, including the combinations with the various languages available on the pages. With a default configuration like this, only the HTML files are published, so no resources (images, Flash, stylesheets). This mode is useful if you want to create HTML files for performance improvements on the same server using mod_rewrite. In that case you would set up mod_rewrite rules that check whether a static file exists in the publish directory and, if so, serve that file. The resource files (which are not served by TYPO3) are already on the server.
Alternatively, you may want a variant of this, namely one that exports all resources as well. This requires only a small addition. The following modified configuration does that:
tx_crawler.crawlerCfg.paramSets {
  staticpub = &L=[|_TABLE:pages_language_overlay;_FIELD:sys_language_uid]
  staticpub.procInstrFilter = tx_staticpub_publish
  staticpub.procInstrParams.tx_staticpub_publish.includeResources = relPath
  staticpub.procInstrParams.tx_staticpub_publish.overruleBaseUrl =
  staticpub.baseUrl = http://localhost:8888/typo3/dummy_4.0/
}
The first added line (“includeResources”) asks for resources to be included and, through the value “relPath”, for the paths to these resources to be made relative. This is necessary because, when “realurl” is used, a base URL is set in the rendered files. For the same reason the second added line (“overruleBaseUrl”) sets the base URL to blank.
The following options are set with “....procInstrParams.tx_staticpub_publish.” as prefix. See also the “crawler” extension documentation for information on “procInstrParams”.
| Property: | Data type: | Description: | Default: |
|---|---|---|---|
| includeResources | boolean/string | If true, resources (images, Flash etc.) are also moved to the publish directory. If it is a string matching “relPath”, the paths to these resources will be prefixed so that a base URL can be avoided. | |
| overruleBaseUrl | string | If found, this will be the value of the <base> tag URL. If blank, the <base> tag is removed if there is one. Any other value is set as <base href="VALUE">. | |
| checkLinks | boolean | If set, all <a> tag links are checked: if they link to a directory, “index.html” is appended; it is then checked whether the file linked to exists, and if it does not, a JavaScript message is shown instead. Useful option if generating a site for offline browsing on a CD. Very important: with this option you must publish the site twice in a row, because otherwise most links will not exist when they are checked and JavaScript error messages will appear. On the second publishing (of exactly the same URLs) all files have already been created, so they are found and the links are made correctly. | |
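
To show how the options in the table combine, here is a sketch of a configuration for an offline copy (for example for a CD). It only uses the properties described above. The base URL is copied from the earlier examples and is an assumption about your setup, and the value “1” is assumed to be an acceptable way to switch on the boolean “checkLinks”.

```typoscript
tx_crawler.crawlerCfg.paramSets {
  staticpub = &L=[|_TABLE:pages_language_overlay;_FIELD:sys_language_uid]
  staticpub.procInstrFilter = tx_staticpub_publish
  # Include resources and prefix their paths so that no base URL is needed
  staticpub.procInstrParams.tx_staticpub_publish.includeResources = relPath
  # Blank value: any <base> tag is removed from the published files
  staticpub.procInstrParams.tx_staticpub_publish.overruleBaseUrl =
  # Check all <a> links (assumed boolean value 1); publish twice in a row
  staticpub.procInstrParams.tx_staticpub_publish.checkLinks = 1
  # Assumed base URL, taken from the examples above; adjust to your site
  staticpub.baseUrl = http://localhost:8888/typo3/dummy_4.0/
}
```

With “checkLinks” enabled, remember to run the publishing twice in a row so that the link targets exist when they are checked.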