CYC (www.cyc.com) is an attempt to advance the field of artificial
intelligence (AI) by building an "encyclopedic" scale common sense
knowledge base (KB). The plausible idea is that we know enough about
AI architecture & control, and that until AI systems are good enough
to read English, the main limit on their learning rates is how much
they already know. CYC has been going since 1984, and now contains
about four million "pieces" of knowledge.
The main limit to growing CYC has been people's time to add knowledge.
Also, CYC has long had lots of media coverage and many well-wishers
(so many that they didn't respond to most inquiries). Is there a way to
allow and motivate a community of amateurs to add to CYC, while still
leaving its owners (Cycorp) an attractive profit?
Many applications were suggested for the internet, but folks were
surprised to see email dominating use for decades. Many applications
were also envisioned for hypertext publishing (i.e., the web), but
folks were surprised to see home pages dominating early use. These
surprises suggest to me that personal social uses might jump start
an "open CYC."
Imagine amateurs using a public CYC language (e.g., CYCL and the upper
ontology) to post small KBs about themselves and their pets, homes,
politics, family, friends, etc. Imagine Cycorp licensing public CYC
servers, analogous to web search engines, which collect posted KBs
and respond to queries asking for people, places, etc. meeting given
descriptions.
A query asking, among other things, for people into sports, or for
people who like mountains, might come back with people who like to ski,
*if* the server understood that skiing is a sport done in the
mountains. Compared to a standard database of personal descriptions,
CYC-like common sense would allow people to better find each other.
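
To make the skiing example concrete, here is a minimal sketch in Python.
It is not CYC or CYCL; the tables GENLS and TYPICAL_PLACE, the three
people, and the function names are all hypothetical, invented only to
show how a little background knowledge changes which people a query
finds.

```python
# Hypothetical toy KB. None of these names come from CYC or CYCL.
GENLS = {                      # "every X is a kind of Y"
    "Skiing": {"Sport"},
    "Chess": {"BoardGame"},
}
TYPICAL_PLACE = {              # "X is typically done in Y"
    "Skiing": "Mountains",
}
PEOPLE = {                     # small personal KBs posted by amateurs
    "alice": {"Skiing"},
    "bob": {"Chess"},
    "carol": {"Mountains"},
}


def generalizations(concept):
    """Return the concept plus everything it is transitively a kind of."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def implied_interests(likes):
    """Expand stated likes with what the background knowledge implies."""
    implied = set()
    for concept in likes:
        for general in generalizations(concept):
            implied.add(general)
            place = TYPICAL_PLACE.get(general)
            if place is not None:
                implied.add(place)
    return implied


def find_people(interest, use_common_sense=True):
    """Return the sorted names of people whose interests include `interest`."""
    matches = []
    for name, likes in PEOPLE.items():
        interests = implied_interests(likes) if use_common_sense else likes
        if interest in interests:
            matches.append(name)
    return sorted(matches)
```

With the background knowledge switched off (use_common_sense=False), a
query for "Sport" finds nobody, which is what a standard database of
personal descriptions would do. With it switched on, the same query
finds the skier, and a query for "Mountains" finds both the skier and
the person who named mountains directly.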
Initially the public servers would not understand many things. This
would encourage ski fans to make and post KBs about skiing, and mountain
fans to post KBs explaining mountains. As with web pages, amateur
KBs might cover topics as diverse as chess, pot stickers, and the
Titanic. Amateur KBs would give bragging rights to their creators and
help people with stronger interests in each area to find and socialize
with each other. And those social ties, after all, are a big part of
what induced people to make topical mailing lists and web pages.
If a social open CYC "took off" like email & home pages did, standard
"network effect" arguments suggest that vast numbers of amateur KBs
should increase the value to Cycorp of their private KB. Any loss
from cheaply licensing CYC for social queries should be more than
compensated by increased demand for other uses.
There are of course many details to be considered:
1) CYC has periodically undergone major reorganizations of its whole
KB. A large amateur community would frown on this. Is CYC stable
enough yet?
2) How simple can a version of CYC be that is enough to support
amateurs in building personal KBs? What is the tradeoff between power
and ease of use?
3) Can a natural line be drawn between the space of "personal
social queries" which servers might be licensed to cheaply answer
and other sorts of queries Cycorp hopes to make most of their
profit on by pricing differently?
4) How can amateurs do quality control among themselves? Simple
votes among amateurs should be sufficient to settle disputes about
what the "common sense" answer to a query is. Can credit assignment
of query answers to KB elements be made precise enough to translate
consensus on query answers into consensus on KB elements?
5) Can amateur KB builders be allowed to inspect inference details
in order to debug their KBs without allowing theft of the entire
private CYC KB?
6) Can someone, hoping to help a CYC server understand him or her,
reveal personal details to that server that he or she would not want
published for all to see?
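
On point 4, one naive way to turn votes on query answers into scores for
KB elements can be sketched as follows. This is only an illustration
under strong assumptions, not a proposal for how CYC does or should do
it: it assumes each answer comes with the list of KB elements used to
derive it, and it splits the majority verdict equally among them. The
function names and the element labels in the usage note are
hypothetical.

```python
from collections import defaultdict


def majority(votes):
    """Return True if most voters accept the answer, False if most reject
    it, and None on a tie."""
    yes = sum(1 for vote in votes if vote)
    no = len(votes) - yes
    if yes == no:
        return None
    return yes > no


def element_scores(answers):
    """Assign credit or blame to KB elements from votes on query answers.

    `answers` is a list of (support, votes) pairs, where `support` lists
    the KB elements used to derive one answer and `votes` is a list of
    booleans (True means the voter accepts the answer). Each decided
    answer contributes +1 or -1, split equally among its elements. The
    equal split is an assumption made for simplicity.
    """
    scores = defaultdict(float)
    for support, votes in answers:
        verdict = majority(votes)
        if verdict is None or not support:
            continue  # ties and unsupported answers carry no information
        share = (1.0 if verdict else -1.0) / len(support)
        for element in support:
            scores[element] += share
    return dict(scores)
```

For example, if voters accept an answer derived from "skiing-is-a-sport"
and "alice-likes-skiing" but reject one derived from "skiing-is-a-sport"
and "chess-is-a-sport", then element_scores gives "chess-is-a-sport" a
negative score and "alice-likes-skiing" a positive one, while
"skiing-is-a-sport" nets out to zero. That last case shows why equal
splitting may not be precise enough: the shared element is blamed for an
error that is plausibly due to the other one.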
FYI, there is a mailing list on related issues:
"The Public Domain Knowledge Bank mailing list
(pdkb@ldl.healthpartners.com)
has the purpose of support, creation and maintenance of a common
sense knowledge base (Public Domain Knowledge Bank also known as PDKB)
available to the any person who wants to use it."
Robin Hanson
hanson@econ.berkeley.edu http://hanson.berkeley.edu/
RWJF Health Policy Scholar FAX: 510-643-8614
140 Warren Hall, UC Berkeley, CA 94720-7360 510-643-1884
after 8/99: Assist. Prof. Economics, George Mason Univ.
Received on Fri Mar 5 19:57:52 1999