author_facet Nado, Robert A.
Huffman, Scott B.
author Nado, Robert A.
Huffman, Scott B.
spellingShingle Nado, Robert A.
Huffman, Scott B.
ACM SIGMOD Record
Extracting entity profiles from semistructured information spaces
Information Systems
Software
author_sort nado, robert a.
spelling Nado, Robert A. Huffman, Scott B. 0163-5808 Association for Computing Machinery (ACM) Information Systems Software http://dx.doi.org/10.1145/271074.271083 A semistructured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because each collection has its own schema, and there are no enforced keys or formats for data items across collections. Thus, structured methods like SQL cannot be easily employed, and users often must make do with only full-text search. In this paper, we describe an approach that provides structured querying for particular types of entities, such as companies and people. Entity-based retrieval is enabled by normalizing entity references in a heuristic, type-dependent manner. The approach can be used to retrieve documents and can also be used to construct entity profiles: summaries of commonly sought information about an entity based on the documents' content. The approach requires only a modest amount of meta-information about the source collections, much of which is derived automatically. Extracting entity profiles from semistructured information spaces ACM SIGMOD Record
doi_str_mv 10.1145/271074.271083
facet_avail Online
Free
finc_class_facet Informatik
format ElectronicArticle
fullrecord blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTE0NS8yNzEwNzQuMjcxMDgz
id ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTE0NS8yNzEwNzQuMjcxMDgz
institution DE-L229
DE-D275
DE-Bn3
DE-Brt1
DE-D161
DE-Zwi2
DE-Gla1
DE-Zi4
DE-15
DE-Pl11
DE-Rs1
FID-BBI-DE-23
DE-105
DE-14
DE-Ch1
imprint Association for Computing Machinery (ACM), 1997
imprint_str_mv Association for Computing Machinery (ACM), 1997
issn 0163-5808
issn_str_mv 0163-5808
language English
mega_collection Association for Computing Machinery (ACM) (CrossRef)
match_str nado1997extractingentityprofilesfromsemistructuredinformationspaces
publishDateSort 1997
publisher Association for Computing Machinery (ACM)
recordtype ai
record_format ai
series ACM SIGMOD Record
source_id 49
title Extracting entity profiles from semistructured information spaces
title_unstemmed Extracting entity profiles from semistructured information spaces
title_full Extracting entity profiles from semistructured information spaces
title_fullStr Extracting entity profiles from semistructured information spaces
title_full_unstemmed Extracting entity profiles from semistructured information spaces
title_short Extracting entity profiles from semistructured information spaces
title_sort extracting entity profiles from semistructured information spaces
topic Information Systems
Software
url http://dx.doi.org/10.1145/271074.271083
publishDate 1997
physical 32-38
description A semistructured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because each collection has its own schema, and there are no enforced keys or formats for data items across collections. Thus, structured methods like SQL cannot be easily employed, and users often must make do with only full-text search. In this paper, we describe an approach that provides structured querying for particular types of entities, such as companies and people. Entity-based retrieval is enabled by normalizing entity references in a heuristic, type-dependent manner. The approach can be used to retrieve documents and can also be used to construct entity profiles: summaries of commonly sought information about an entity based on the documents' content. The approach requires only a modest amount of meta-information about the source collections, much of which is derived automatically.
container_issue 4
container_start_page 32
container_title ACM SIGMOD Record
container_volume 26
format_de105 Article, E-Article
format_de14 Article, E-Article
format_de15 Article, E-Article
format_de520 Article, E-Article
format_de540 Article, E-Article
format_dech1 Article, E-Article
format_ded117 Article, E-Article
format_degla1 E-Article
format_del152 Buch
format_del189 Article, E-Article
format_dezi4 Article
format_dezwi2 Article, E-Article
format_finc Article, E-Article
format_nrw Article, E-Article
_version_ 1792329273079496713
geogr_code not assigned
last_indexed 2024-03-01T13:06:29.905Z
geogr_code_person not assigned
openURL url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=Extracting+entity+profiles+from+semistructured+information+spaces&rft.date=1997-12-01&genre=article&issn=0163-5808&volume=26&issue=4&spage=32&epage=38&pages=32-38&jtitle=ACM+SIGMOD+Record&atitle=Extracting+entity+profiles+from+semistructured+information+spaces&aulast=Huffman&aufirst=Scott+B.&rft_id=info%3Adoi%2F10.1145%2F271074.271083&rft.language%5B0%5D=eng
SOLR
_version_ 1792329273079496713
author Nado, Robert A., Huffman, Scott B.
author_facet Nado, Robert A., Huffman, Scott B.
author_sort nado, robert a.
container_issue 4
container_start_page 32
container_title ACM SIGMOD Record
container_volume 26
description A semistructured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because each collection has its own schema, and there are no enforced keys or formats for data items across collections. Thus, structured methods like SQL cannot be easily employed, and users often must make do with only full-text search. In this paper, we describe an approach that provides structured querying for particular types of entities, such as companies and people. Entity-based retrieval is enabled by normalizing entity references in a heuristic, type-dependent manner. The approach can be used to retrieve documents and can also be used to construct entity profiles: summaries of commonly sought information about an entity based on the documents' content. The approach requires only a modest amount of meta-information about the source collections, much of which is derived automatically.
doi_str_mv 10.1145/271074.271083
facet_avail Online, Free
finc_class_facet Informatik
format ElectronicArticle
format_de105 Article, E-Article
format_de14 Article, E-Article
format_de15 Article, E-Article
format_de520 Article, E-Article
format_de540 Article, E-Article
format_dech1 Article, E-Article
format_ded117 Article, E-Article
format_degla1 E-Article
format_del152 Buch
format_del189 Article, E-Article
format_dezi4 Article
format_dezwi2 Article, E-Article
format_finc Article, E-Article
format_nrw Article, E-Article
geogr_code not assigned
geogr_code_person not assigned
id ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTE0NS8yNzEwNzQuMjcxMDgz
imprint Association for Computing Machinery (ACM), 1997
imprint_str_mv Association for Computing Machinery (ACM), 1997
institution DE-L229, DE-D275, DE-Bn3, DE-Brt1, DE-D161, DE-Zwi2, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1, FID-BBI-DE-23, DE-105, DE-14, DE-Ch1
issn 0163-5808
issn_str_mv 0163-5808
language English
last_indexed 2024-03-01T13:06:29.905Z
match_str nado1997extractingentityprofilesfromsemistructuredinformationspaces
mega_collection Association for Computing Machinery (ACM) (CrossRef)
physical 32-38
publishDate 1997
publishDateSort 1997
publisher Association for Computing Machinery (ACM)
record_format ai
recordtype ai
series ACM SIGMOD Record
source_id 49
spelling Nado, Robert A. Huffman, Scott B. 0163-5808 Association for Computing Machinery (ACM) Information Systems Software http://dx.doi.org/10.1145/271074.271083 A semistructured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because each collection has its own schema, and there are no enforced keys or formats for data items across collections. Thus, structured methods like SQL cannot be easily employed, and users often must make do with only full-text search. In this paper, we describe an approach that provides structured querying for particular types of entities, such as companies and people. Entity-based retrieval is enabled by normalizing entity references in a heuristic, type-dependent manner. The approach can be used to retrieve documents and can also be used to construct entity profiles: summaries of commonly sought information about an entity based on the documents' content. The approach requires only a modest amount of meta-information about the source collections, much of which is derived automatically. Extracting entity profiles from semistructured information spaces ACM SIGMOD Record
spellingShingle Nado, Robert A., Huffman, Scott B., ACM SIGMOD Record, Extracting entity profiles from semistructured information spaces, Information Systems, Software
title Extracting entity profiles from semistructured information spaces
title_full Extracting entity profiles from semistructured information spaces
title_fullStr Extracting entity profiles from semistructured information spaces
title_full_unstemmed Extracting entity profiles from semistructured information spaces
title_short Extracting entity profiles from semistructured information spaces
title_sort extracting entity profiles from semistructured information spaces
title_unstemmed Extracting entity profiles from semistructured information spaces
topic Information Systems, Software
url http://dx.doi.org/10.1145/271074.271083