Extracting entity profiles from semistructured information spaces

Bibliographic details
Journal title:	ACM SIGMOD Record
Authors and corporate bodies:	Nado, Robert A., Huffman, Scott B.
In:	ACM SIGMOD Record, 26, 1997, 4, pp. 32-38
Format:	E-Article
Language:	English
Published:	Association for Computing Machinery (ACM)
Subjects:	Information Systems, Software

author_facet	Nado, Robert A., Huffman, Scott B.
author	Nado, Robert A., Huffman, Scott B.
author_sort	nado, robert a.
doi_str_mv	10.1145/271074.271083
facet_avail	Online, Free
finc_class_facet	Computer science
format	ElectronicArticle
fullrecord	blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTE0NS8yNzEwNzQuMjcxMDgz
id	ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTE0NS8yNzEwNzQuMjcxMDgz
institution	DE-L229, DE-D275, DE-Bn3, DE-Brt1, DE-D161, DE-Zwi2, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1, FID-BBI-DE-23, DE-105, DE-14, DE-Ch1
imprint	Association for Computing Machinery (ACM), 1997
imprint_str_mv	Association for Computing Machinery (ACM), 1997
issn	0163-5808
issn_str_mv	0163-5808
language	English
mega_collection	Association for Computing Machinery (ACM) (CrossRef)
match_str	nado1997extractingentityprofilesfromsemistructuredinformationspaces
publishDateSort	1997
publisher	Association for Computing Machinery (ACM)
recordtype	ai
record_format	ai
series	ACM SIGMOD Record
source_id	49
title	Extracting entity profiles from semistructured information spaces
title_sort	extracting entity profiles from semistructured information spaces
topic	Information Systems, Software
url	http://dx.doi.org/10.1145/271074.271083
publishDate	1997
physical	32-38
description	A semistructured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because each collection has its own schema, and there are no enforced keys or formats for data items across collections. Thus, structured methods like SQL cannot be easily employed, and users often must make do with only full-text search. In this paper, we describe an approach that provides structured querying for particular types of entities, such as companies and people. Entity-based retrieval is enabled by normalizing entity references in a heuristic, type-dependent manner. The approach can be used to retrieve documents and can also be used to construct entity profiles: summaries of commonly sought information about an entity based on the documents' content. The approach requires only a modest amount of meta-information about the source collections, much of which is derived automatically.
container_issue	4
container_start_page	32
container_title	ACM SIGMOD Record
container_volume	26
format_de105	Article, E-Article
format_de14	Article, E-Article
format_de15	Article, E-Article
format_de520	Article, E-Article
format_de540	Article, E-Article
format_dech1	Article, E-Article
format_ded117	Article, E-Article
format_degla1	E-Article
format_del152	Book
format_del189	Article, E-Article
format_dezi4	Article
format_dezwi2	Article, E-Article
format_finc	Article, E-Article
format_nrw	Article, E-Article
_version_	1792329273079496713
geogr_code	not assigned
last_indexed	2024-03-01T13:06:29.905Z
geogr_code_person	not assigned
openURL	url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=Extracting+entity+profiles+from+semistructured+information+spaces&rft.date=1997-12-01&genre=article&issn=0163-5808&volume=26&issue=4&spage=32&epage=38&pages=32-38&jtitle=ACM+SIGMOD+Record&atitle=Extracting+entity+profiles+from+semistructured+information+spaces&aulast=Huffman&aufirst=Scott+B.&rft_id=info%3Adoi%2F10.1145%2F271074.271083&rft.language%5B0%5D=eng