TIBCO Spotfire® Attivio® AIE® Language Pack Quick Start Guide

Software Release 4.3.1
April 2015
Important Information
SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH
EMBEDDED OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE THE FUNCTIONALITY
(OR PROVIDE LIMITED ADD-ON FUNCTIONALITY) OF THE LICENSED TIBCO SOFTWARE. THE
EMBEDDED OR BUNDLED SOFTWARE IS NOT LICENSED TO BE USED OR ACCESSED BY ANY
OTHER TIBCO SOFTWARE OR FOR ANY OTHER PURPOSE.
USE OF TIBCO SOFTWARE AND THIS DOCUMENT IS SUBJECT TO THE TERMS AND
CONDITIONS OF A LICENSE AGREEMENT FOUND IN EITHER A SEPARATELY EXECUTED
SOFTWARE LICENSE AGREEMENT, OR, IF THERE IS NO SUCH SEPARATE AGREEMENT, THE
CLICKWRAP END USER LICENSE AGREEMENT WHICH IS DISPLAYED DURING DOWNLOAD
OR INSTALLATION OF THE SOFTWARE (AND WHICH IS DUPLICATED IN THE LICENSE FILE)
OR IF THERE IS NO SUCH SOFTWARE LICENSE AGREEMENT OR CLICKWRAP END USER
LICENSE AGREEMENT, THE LICENSE(S) LOCATED IN THE “LICENSE” FILE(S) OF THE
SOFTWARE. USE OF THIS DOCUMENT IS SUBJECT TO THOSE TERMS AND CONDITIONS, AND
YOUR USE HEREOF SHALL CONSTITUTE ACCEPTANCE OF AND AN AGREEMENT TO BE
BOUND BY THE SAME.
This document contains confidential information that is subject to U.S. and international copyright laws
and treaties. No part of this document may be reproduced in any form without the written
authorization of TIBCO Software Inc.
TIBCO and Two-Second Advantage are either registered trademarks or trademarks of TIBCO Software
Inc. in the United States and/or other countries.
All other product and company names and marks mentioned in this document are the property of their
respective owners and are mentioned for identification purposes only.
THIS SOFTWARE MAY BE AVAILABLE ON MULTIPLE OPERATING SYSTEMS. HOWEVER, NOT
ALL OPERATING SYSTEM PLATFORMS FOR A SPECIFIC SOFTWARE VERSION ARE RELEASED
AT THE SAME TIME. SEE THE README FILE FOR THE AVAILABILITY OF THIS SOFTWARE
VERSION ON A SPECIFIC OPERATING SYSTEM PLATFORM.
THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
THIS DOCUMENT COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL
ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE
CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF THIS DOCUMENT. TIBCO
SOFTWARE INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S)
AND/OR THE PROGRAM(S) DESCRIBED IN THIS DOCUMENT AT ANY TIME.
THE CONTENTS OF THIS DOCUMENT MAY BE MODIFIED AND/OR QUALIFIED, DIRECTLY OR
INDIRECTLY, BY OTHER DOCUMENTATION WHICH ACCOMPANIES THIS SOFTWARE,
INCLUDING BUT NOT LIMITED TO ANY RELEASE NOTES AND "READ ME" FILES.
Copyright © 2010-2015 TIBCO Software Inc. ALL RIGHTS RESERVED.
TIBCO Software Inc. Confidential Information
Contents
TIBCO Documentation and Support Services
Introduction to TIBCO Spotfire® Attivio® AIE® Language Pack
Install the Advanced Linguistics Module (ALM)
Install a Language Module
Update news Connector
Add Stopwords
Apply Configuration Changes to the Project
Data Access in the Quick Start Analysis
    Configuring Information Links (JDBC)
    Configuring Access to Custom Data Source (ODBC)
Tour of the Quick Start Analysis
    Cover Page
    Keyphrases
TIBCO Documentation and Support Services
All TIBCO documentation is available on the TIBCO Documentation site, which can be found here:
https://docs.tibco.com
TIBCO Spotfire Documentation
The following documents, which may be relevant when using this product, can be found on the TIBCO
Documentation site:
● TIBCO Spotfire® Attivio AIE Quick Start Guide
● TIBCO Spotfire® Attivio AIE Language Pack Quick Start Guide
● TIBCO Spotfire® Attivio AIE Release Notes
● TIBCO Spotfire® Attivio AIE Language Pack Release Notes
● TIBCO Spotfire® Professional User's Guide
● TIBCO Spotfire® Administration Manager - User's Guide
● TIBCO Spotfire® Deployment and Administration Manual
● TIBCO Spotfire® License Agreement
Attivio AIE Documentation
For information about the Attivio AIE product, refer to the documentation on
https://developer.attivio.com. See Getting Access to Attivio AIE Documentation in the TIBCO Spotfire®
Attivio AIE Quick Start Guide for more information.
How to Contact TIBCO Support
For comments or problems with this manual or the software it addresses, contact TIBCO Support as
follows:
● For an overview of TIBCO Support, and information about getting started with TIBCO Support,
visit this site:
http://www.tibco.com/services/support
● If you already have a valid maintenance or support contract, visit this site:
https://support.tibco.com
Entry to this site requires a user name and password. If you do not have a user name, you can
request one.
How to Join TIBCOmmunity
TIBCOmmunity is an online destination for TIBCO customers, partners, and resident experts. It is a
place to share and access the collective experience of the TIBCO community. TIBCOmmunity offers
forums, blogs, and access to a variety of resources. To register, go to:
https://www.tibcommunity.com
Introduction to TIBCO Spotfire® Attivio® AIE® Language Pack
If you want to analyze content with TIBCO Spotfire Attivio AIE in languages other than English, you
need to deploy the language pack product in addition to the main product. This quick start guide
includes an example of how to install and set up the language pack for Spanish, but you can use the
instructions as a basis for installing other language packs as well.
The features available may vary between languages. See the complete list of features by
language on developer.attivio.com. A good start page for general information about AIE on
developer.attivio.com is AIE Documentation. See the TIBCO Spotfire Attivio AIE Quick Start Guide for
information about how to get access to developer.attivio.com.
Supported features for Spanish
● Lemmatization
● Trainable Classification and Sentiment Analysis
● Classification Model
● Sentiment Analysis Model
● Entity-Sentiment Analysis
● Key Phrase Extraction
● More-like-this
● Result Clustering
● OCR Module
● Dictionary-Based Entity Extraction
● Statistical Entity Extraction
● Spelling Correction
Not all features are described in this Quick Start Guide.
This Language Pack Quick Start Guide builds on the configuration done in the TIBCO Spotfire Attivio AIE
Quick Start Guide, so install the Factbook example in that guide before following the instructions in
this guide.
The Advanced Linguistics Module can identify a large number of languages. More advanced analytics are
available in the following languages: Arabic, Chinese (Simplified), Chinese (Traditional), Croatian,
Czech, Danish, Dutch, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean,
Latvian, Norwegian (Bokmål), Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish,
Vietnamese.
Install the Advanced Linguistics Module (ALM)
The Advanced Linguistics Module (ALM) extends the linguistic capabilities of AIE.
Prerequisites
TIBCO Spotfire Attivio AIE must be installed and configured as described in the TIBCO Spotfire Attivio
AIE Quick Start Guide.
Procedure
1. Extract the file aie-module-alm-4.3.1.95376-win64-aieDist.zip in your Attivio installation
directory.
2. In AIE Designer, restart your project by stopping and starting all project servers.
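If you prefer to script step 1 instead of using a file manager, the extraction could be sketched in Python as follows. This is only an illustration: the function name extract_module is hypothetical, the layout of the archive is not described in this guide, and the installation directory shown in the comment is an assumption that you must replace with your own.

```python
import zipfile
from pathlib import Path


def extract_module(archive_path, install_dir):
    """Extract every entry of archive_path into install_dir and return the entry names."""
    install_dir = Path(install_dir)
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
        archive.extractall(install_dir)
    return names


# Example call (the directory is an assumption, not a documented default):
# extract_module("aie-module-alm-4.3.1.95376-win64-aieDist.zip", r"C:\attivio")
```

The returned list of entry names lets you review what was written into the installation directory before you restart the project servers in step 2.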
Install a Language Module
Each language module of interest must be installed separately. This example uses the language module
for Spanish, but the instructions can be used for installing any language module if you replace the file
names and language codes with those of the language of your choice.
Prerequisites
TIBCO Spotfire Attivio AIE must be installed and configured as described in the TIBCO Spotfire Attivio
AIE Quick Start Guide. The install_dir is the directory where you have installed TIBCO Spotfire
Attivio AIE.
Procedure
1. Extract the file lm-0.2-es-BT.zip somewhere on your computer.
2. Move extract_dir/lib/lm-0.2-es-BT.jar to install_dir/lib.
3. Open the file install_dir\conf\languagemodel\module.xml using a text editor (e.g., Notepad).
4. Add the Spanish map in the models section.
<map name="models">
  ...
  <map name="es">
    <property name="1" value="languagemodel/lm/es-BT/1grams.bin" />
    <property name="2" value="languagemodel/lm/es-BT/2grams.bin" />
    <property name="3" value="languagemodel/lm/es-BT/3grams.bin" />
    <property name="4" value="languagemodel/lm/es-BT/4grams.bin" />
    <property name="5" value="languagemodel/lm/es-BT/5grams.bin" />
  </map>
  ...
</map>
5. Save the module.xml file.
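When you install several language modules, steps 3 through 5 can also be scripted. The following Python sketch adds a language map under the models map using the standard xml.etree.ElementTree module. The function name add_language_model is hypothetical, and the sketch assumes that module.xml uses no XML namespaces and that the five n-gram files follow the naming shown in step 4. ElementTree does not preserve comments or the original formatting, so back up the file first.

```python
import xml.etree.ElementTree as ET


def add_language_model(xml_text, lang, model_dir, orders=range(1, 6)):
    """Return xml_text with a <map name=lang> of n-gram properties under <map name="models">."""
    root = ET.fromstring(xml_text)
    # Locate the models map wherever it sits in the file (assumes no XML namespaces).
    models = next((m for m in root.iter("map") if m.get("name") == "models"), None)
    if models is None:
        raise ValueError('no <map name="models"> element found')
    # Leave the document unchanged if the language is already configured.
    if any(child.get("name") == lang for child in models.findall("map")):
        return xml_text
    lang_map = ET.SubElement(models, "map", {"name": lang})
    for n in orders:
        ET.SubElement(
            lang_map, "property",
            {"name": str(n), "value": f"{model_dir}/{n}grams.bin"},
        )
    return ET.tostring(root, encoding="unicode")


# Example call for the Spanish module used in this guide:
# new_text = add_language_model(old_text, "es", "languagemodel/lm/es-BT")
```

The function works on text rather than on the file itself, so you can compare the result with the original before you overwrite module.xml.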
Update news Connector
Key-phrase extraction is a way to get a summary of a document by extracting its most distinctive words
and phrases. AIE is configured to do this automatically during document ingestion, but the default
languageModelService contains language information for English only. To analyze other
languages, key-phrase extraction has to be enabled for each language. The example below shows how to
add sites in Spanish to the news connector, which is needed if you want to do key-phrase extraction in
Spanish.
Prerequisites
TIBCO Spotfire Attivio AIE must be installed and configured as described in the TIBCO Spotfire Attivio
AIE Quick Start Guide. The instructions under Install the Advanced Linguistics Module (ALM) and
Install a Language Module must be completed.
Procedure
1. Open the file install_dir/conf/factbook/content/news.csv using a text editor (e.g., Notepad).
2. Add some Spanish sites to the file. For example:
El Pais,http://ep00.epimg.net/rss/internacional/portada.xml
El Mundo,http://estaticos.elmundo.es/elmundo/rss/internacional.xml
3. Save the news.csv file.
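If you maintain a longer list of feeds, the edit in step 2 can be scripted. The Python sketch below appends name and URL rows to news.csv and skips URLs that are already listed. The function name add_feeds is hypothetical, and the sketch assumes that the file is UTF-8 encoded and contains one name,URL pair per line, as in the example above.

```python
import csv
from pathlib import Path


def add_feeds(csv_path, feeds):
    """Append (name, url) rows to csv_path, skipping URLs already present; return rows added."""
    path = Path(csv_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = {row[1] for row in csv.reader(text.splitlines()) if len(row) >= 2}
    new_rows = [(name, url) for name, url in feeds if url not in existing]
    if not new_rows:
        return 0
    with path.open("a", encoding="utf-8", newline="") as f:
        # Start on a fresh line if the file does not end with a line break.
        if text and not text.endswith("\n"):
            f.write("\n")
        csv.writer(f, lineterminator="\n").writerows(new_rows)
    return len(new_rows)


# Example call with the two feeds from step 2:
# add_feeds("install_dir/conf/factbook/content/news.csv", [
#     ("El Pais", "http://ep00.epimg.net/rss/internacional/portada.xml"),
#     ("El Mundo", "http://estaticos.elmundo.es/elmundo/rss/internacional.xml"),
# ])
```

Because existing URLs are skipped, running the script twice does not create duplicate entries.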
Add Stopwords
Stopwords are common words that you want to filter out when doing a search. In English, for example,
these could be words like "a", "as", or "the". When you add a language module, you should also add
a stopword list for that language. These instructions use an example of how to apply a stopword list
for Spanish, but you can use the instructions as a basis for adding stopwords to other languages as well.
A large stopword list reduces the disk space needed for the index, but it also
reduces the ability of the system to process queries containing the stopwords.
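To illustrate the trade-off, the following Python sketch filters stopwords from a text. It is a conceptual illustration only and does not reflect how AIE tokenizes or indexes content. The function name remove_stopwords and the five-word Spanish list are assumptions made for the example.

```python
def remove_stopwords(text, stopwords):
    """Return the lowercased, whitespace-separated tokens of text that are not stopwords."""
    return [token for token in text.lower().split() if token not in stopwords]


# A tiny illustrative list; a real stopword list is much longer.
spanish_stopwords = {"el", "la", "de", "en", "y"}

# Only the distinctive words remain, so less needs to be stored.
kept = remove_stopwords("El precio de la gasolina", spanish_stopwords)

# A query made up entirely of stopwords has nothing left to match.
lost = remove_stopwords("la de", spanish_stopwords)
```

Here kept contains only "precio" and "gasolina", while lost is empty, which is why a very large stopword list can make some queries impossible to answer.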
Prerequisites
TIBCO Spotfire Attivio AIE must be installed and configured as described in the TIBCO Spotfire Attivio
AIE Quick Start Guide. The instructions under Install the Advanced Linguistics Module (ALM) and
Install a Language Module must be completed.
Procedure
1. In AIE Designer, in the Package Explorer, open conf/components/extractKeyPhrases.xml.
2. Add the <property name="es" value="conf/dictionaries/50_stopwords_es.txt"/> entry
with the Spanish stopwords after the entry for English stopwords in the XML source:
<map name="stopWordDictionaries">
<property name="en" value="keyphrases/dictionaries/very_common_words_en.csv"/>
<property name="es" value="conf/dictionaries/50_stopwords_es.txt"/>
</map>
3. Select File > Save.
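After saving, you may want to confirm that the new entry is in place. The Python sketch below reads the stopWordDictionaries map and returns the configured dictionary for each language code. The function name stopword_dictionaries is hypothetical, and the sketch assumes that extractKeyPhrases.xml uses no XML namespaces.

```python
import xml.etree.ElementTree as ET


def stopword_dictionaries(xml_text):
    """Return {language code: dictionary path} from <map name="stopWordDictionaries">."""
    root = ET.fromstring(xml_text)
    for element in root.iter("map"):
        if element.get("name") == "stopWordDictionaries":
            return {p.get("name"): p.get("value") for p in element.findall("property")}
    return {}  # the map is missing from the file


# Example call:
# with open("conf/components/extractKeyPhrases.xml", encoding="utf-8") as f:
#     print(stopword_dictionaries(f.read()))
```

For the configuration shown in step 2, the result should contain an entry for "en" and an entry for "es".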
Apply Configuration Changes to the Project
When you have added language support to a project, you need to deploy the changes and restart the
servers before the updates can be used.
Prerequisites
The instructions under Update news Connector and Add Stopwords should be completed.
Procedure
1. In AIE Designer, click AIE Runtime > Deploy Project Configuration.
2. Select Deploy and restart project nodes and click OK.
When all project servers display the status Running in green, you can continue to the next step.
3. In a web browser, open the AIE Administrator by entering the address of the installation host, the
AIE port number, and "/admin" at the end. For this tutorial, the URL is:
http://<attivio_hostname>:17000/admin
If you are using a local installation of AIE, then the attivio_hostname is "localhost".
4. On the left side of the AIE Administrator, under System Management, click Connectors.
5. Start the SQLMvfRegeneration connector by selecting the check box to the left of the connector
name, and then click Start.
The keyphrases are updated with the new language.
6. Start the news connector by selecting the check box to the left of the connector name, and then click
Start.
The number of docs sent should now be higher than in the last run, because news is imported from
more sources.
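If the AIE Administrator page in step 3 does not open, it can help to check the address itself. The Python sketch below builds the URL from the host name and port given in step 3 and tests whether anything answers there. The function names admin_url and is_reachable are hypothetical, and the check only tells you that a server responded, not that AIE is fully started.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlunsplit
from urllib.request import urlopen


def admin_url(hostname="localhost", port=17000):
    """Build the AIE Administrator address: host, port, then "/admin"."""
    return urlunsplit(("http", f"{hostname}:{port}", "/admin", "", ""))


def is_reachable(url, timeout=5):
    """Return True if any HTTP response comes back from url within timeout seconds."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        return True   # the server answered, even if with an error status
    except (URLError, OSError):
        return False  # nothing is listening, or the host cannot be resolved


# Example call for a local installation:
# print(is_reachable(admin_url()))
```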
Data Access in the Quick Start Analysis
This section describes how to update the Spotfire data tables in the quick start analysis.
TIBCO Spotfire Attivio AIE Language Pack ships with pre-configured demo analysis files. Follow one
of the procedures below, depending on whether you configured a JDBC or ODBC connection, to ensure
you get access to the data required in the analyses.
● Configuring Information Links (JDBC)
● Configuring Access to Custom Data Source (ODBC)
Configuring Information Links (JDBC)
This section describes how to configure information links needed to provide data to the JDBC version
of the example analysis.
Although the example analysis has existing data tables, you must replace the content with information
links for your environment.
Prerequisites
You must have set up JDBC as the connectivity method as described in the TIBCO Spotfire Attivio AIE
Quick Start Guide.
Procedure
1. Open the example analysis TIB_sfire-aa-pk_4.3.1_QuickstartJDBC_Spotfire_6.5.dxp in
Spotfire.
2. A Missing Information Link dialog appears. Select Browse for the missing information link, and
then click OK.
3. In the Select Information Link dialog, select the news information link and then click OK.
Repeat steps 2 and 3 for the keyphrases information link.
4. Select Insert > Columns and make sure that keyphrases is selected under Add columns to data table.
5. From the Select menu, under From Current Analysis, select news.
6. Click Next >.
7. In the Insert Columns – Match columns dialog, click to select AIE_PARENT_DOCID in the From
current data list and AIE_DOCID in the From new data list and then click Match Selected.
8. In the Insert Columns – Import dialog, select the check box for language.
9. Click Finish.
10. Select File > Save.
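Steps 4 through 9 add the language of each news document to the keyphrases that were extracted from it. Conceptually this is a join between the two data tables, which can be sketched in Python as follows. The sketch is not Spotfire code. The column names AIE_PARENT_DOCID, AIE_DOCID, and language come from the steps above, while the function name add_language, the keyphrase column, and the sample rows are assumptions made for the example. It also assumes that every keyphrase row is kept, even when no matching news row exists.

```python
def add_language(keyphrases, news):
    """Copy 'language' from the news row whose AIE_DOCID equals the row's AIE_PARENT_DOCID."""
    language_by_doc = {row["AIE_DOCID"]: row["language"] for row in news}
    return [
        {**kp, "language": language_by_doc.get(kp["AIE_PARENT_DOCID"])}
        for kp in keyphrases
    ]


# Hypothetical sample rows.
news = [
    {"AIE_DOCID": "doc1", "language": "Spanish"},
    {"AIE_DOCID": "doc2", "language": "English"},
]
keyphrases = [
    {"AIE_PARENT_DOCID": "doc1", "keyphrase": "elecciones"},
    {"AIE_PARENT_DOCID": "doc2", "keyphrase": "election"},
    {"AIE_PARENT_DOCID": "doc9", "keyphrase": "orphan"},
]

joined = add_language(keyphrases, news)
```

In joined, the first two keyphrases carry the language of their parent document, and the third has no language because no news row matches its AIE_PARENT_DOCID.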
Configuring Access to Custom Data Source (ODBC)
This section describes how to configure the access to the custom data source in Spotfire needed to
provide data to the ODBC version of the example analysis.
Although the example analysis has existing data tables, you must replace them with content from your
own ODBC environment.
The data table links in the original ODBC example analysis are connected to an ODBC connection
called AIE ODBC connection. If you used a different name while configuring the ODBC connection,
you must replace the table links for this example to work correctly.
Prerequisites
You must have configured your system using ODBC as the connectivity method, as described in the
TIBCO Spotfire Attivio AIE Quick Start Guide.
Procedure
1. Open the example analysis TIB_sfire-att-aie_4.3.1_QuickstartODBC_Spotfire_6.5_ALM.dxp
in Spotfire Analyst. You will receive three error messages about missing data. Ignore these
messages by clicking OK.
2. Select File > Replace Data Table.
3. Set news as the data table to replace and click Select.
4. For an ODBC connection, select Database, set ODBC Data Provider as the data source type, and click
Configure. Select the System or User data source that you created earlier and click OK.
5. In the Specify Tables and Columns dialog, navigate to tables > news and select the check box for
news. In Data source name type news, and then click OK.
6. In the Replace Data Table dialog, click OK.
7. Repeat steps 2 through 6 to configure the keyphrases table.
8. Select Insert > Columns and make sure that keyphrases is selected under Add columns to data table.
9. From the Select menu, under From Current Analysis, select news.
10. Click Next >.
11. In the Insert Columns – Match columns dialog, click to select AIE_PARENT_DOCID in the From
current data list and AIE_DOCID in the From new data list and then click Match Selected.
12. In the Insert Columns – Import dialog, select the check box for language.
13. Click Finish.
14. Select File > Save.
Tour of the Quick Start Analysis
This section describes how to use the quick start analysis and the key features used.
Cover Page
The cover page text lets you know whether you have opened the JDBC or ODBC analysis template, and
which version of the file you use.
Keyphrases
This page shows an example of how you can set up an analysis that shows keyphrase extraction in
different languages, which you select using a property control in the text area.
Only English and Spanish keyphrase extraction are enabled in this example. If you want to view
news in any other language, you need to adapt the instructions in this guide to configure those
languages as well.
Language
This is a list of some of the languages in your news flow that could potentially be analyzed. Click a
language to analyze news in that language. In the current example, however, keyphrases exist only
for English and Spanish, since no other languages have been configured in this quick start
guide.
When you select Spanish or English in the list, keyphrases in the selected language are shown in the
visualization to the right.
Keyphrases
When English or Spanish is selected in the Language list you will see the keyphrases extracted from
your news flow in that particular language. The size of each keyphrase box corresponds to the number
of articles that contain that particular keyphrase.
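The sizing rule can be sketched in Python as a count of distinct articles per keyphrase. This only mirrors the description above and is not how Spotfire computes the visualization. The function name keyphrase_sizes and the sample pairs are assumptions made for the example.

```python
from collections import Counter


def keyphrase_sizes(rows):
    """Count the distinct articles per keyphrase from (article id, keyphrase) pairs."""
    # Converting to a set first means an article counts once per keyphrase.
    return Counter(keyphrase for _, keyphrase in set(rows))


# Hypothetical sample data: doc1 lists "elecciones" twice.
rows = [
    ("doc1", "elecciones"),
    ("doc1", "elecciones"),
    ("doc2", "elecciones"),
    ("doc2", "gobierno"),
]

sizes = keyphrase_sizes(rows)
```

In this sample, "elecciones" appears in two articles and "gobierno" in one, so the box for "elecciones" would be the larger of the two.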
Articles
If one or more keyphrases are marked in the Keyphrases visualization, articles that include those
keyphrases are listed in the Articles table. Click the URI (to the far right in the table) to open the
article in your default web browser.