The most valuable asset of the IEEE’s business, the one that attracts members and paying customers, is the IEEE Electronic Library (IEL). It contains the paper repository known as the IEEE Xplore Digital Library, with over 4.5 million conference and journal papers. IEEE Xplore includes some third-party content, but it mostly consists of papers from the 186 Transactions, Journals, Letters and Magazines owned by IEEE and from IEEE conferences, whether financially or technically sponsored by IEEE. This impressive body of scientific and engineering knowledge includes legacy issues going back several decades. Moreover, around a quarter of a million manuscripts are added each year. Much to read and follow!
Although IEEE’s strength is in creating intellectual property (IP), we format the IP in a classic, not to say traditional, way. The format is raw and unprocessed, much as it was 200 years ago when scientific publishing was incubated. An IP nugget is presented in IEEE Xplore as a traditional paper, which does not readily help answer the questions a reader might have. The paper is disconnected from other papers, and its valuable content is buried in text, mixed with formulas, figures, graphs, tables and illustrations.
If we provide our IP users and members with knowledge, that is, if we process the IP to answer users’ questions or to recommend a design or algorithm, the result will be of much higher value than purely traditional papers. Papers carry a great deal of buried, heavily wrapped technical and scientific information as overhead. Concepts are disconnected or spread across papers, and evaluation tools beyond keywords are unavailable in IEEE Xplore. As a result, papers are searched (and retrieved) today by isolated keywords, much like a bag of unrelated objects that have no associations.
Today’s search techniques used in IEEE Xplore to discover the ‘knowledge’ rely on keywords defined by authors, IEEE and INSPEC. If the keyword of interest is included in the paper’s Metadata (Title and Abstract), the paper is returned by the search. The figure below illustrates the full set of keywords for one of the 949 papers retrieved in IEEE Xplore with the ‘autoencoder’ keyword (2013 T-PAMI, doi: 10.1109/TPAMI.2013.50).
To distinguish between information retrieval and knowledge retrieval, let’s look at this example further. Autoencoders (AEs) are deep learning/neural network tools proven to be very efficient and accurate in pattern recognition. They come in many variations, and each can be used for various tasks. Suppose a researcher is interested in inventing a better AE for handwritten digit recognition, or a data engineer wants to program the existing AE that is best for this job. An advanced Boolean search of ((AE) AND (MNIST database of handwritten digits)) on the Metadata would now yield 41 papers in IEEE Xplore. These are papers that comprehensively discuss AEs in the context of the popular MNIST database.
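To make the mechanics of such a query concrete, here is a minimal sketch of a Boolean AND search over paper metadata. It is an illustration only, not how IEEE Xplore is implemented: the `Paper` record, the function names and the three toy entries are all hypothetical, and matching is a plain case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical metadata record: only a title and an abstract."""
    title: str
    abstract: str


def matches_all(paper, terms):
    """Return True if every term occurs (case-insensitively) in the metadata."""
    metadata = f"{paper.title} {paper.abstract}".lower()
    return all(term.lower() in metadata for term in terms)


def boolean_and_search(papers, terms):
    """Keep only the papers whose metadata contains all of the terms."""
    return [paper for paper in papers if matches_all(paper, terms)]


# Toy records invented for this example, not real IEEE Xplore entries.
PAPERS = [
    Paper("Stacked autoencoder for digit recognition",
          "We evaluate on the MNIST database of handwritten digits."),
    Paper("A survey of autoencoder variants",
          "We review sparse, denoising and variational models."),
    Paper("Convolutional networks for MNIST",
          "Experiments use the MNIST database of handwritten digits."),
]

hits = boolean_and_search(
    PAPERS, ["autoencoder", "MNIST database of handwritten digits"]
)
for paper in hits:
    print(paper.title)
```

Only the first toy record satisfies both terms. The search says nothing about which returned paper is the better one, which is exactly the limitation discussed next.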
Now assume that in addition to this information, a quality criterion is of interest, such as the classification accuracy of the AEs. Knowing the accuracy of each specific AE is essential in order to select the best one. With accuracy as an added quality measure, a refined Boolean search for ((AE) AND (MNIST database of handwritten digits) AND (most accurate)) on Full Text and Metadata can be done. However, this search leads to trouble, since as many as 100 papers are returned. All 100 papers must be inspected, because IEEE Xplore has no search mechanism that would find the most accurate AE model worthy of further consideration. The information is there, but only as a pile of papers to scroll through, unless we have tools to extract the information about accuracy. That is the missing piece of evaluative knowledge we need to acquire. Intelligent data analysis could be used at this stage to help with the process of knowledge extraction.
The knowledge about accuracy is hidden in the tables or graphs of each paper, both of which typically report statistics from simulations. Finding the accuracy figures by eye would require combing through hundreds of pages. This could be automated by extracting and analyzing the data from the papers’ tables and graphs. Ordering the results reported by the various papers by accuracy would then complete the task of knowledge retrieval.
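The extract-then-rank step can be sketched in a few lines, under strong simplifying assumptions: each paper’s results table is already available as plain text, and every percentage in it is an accuracy figure. The paper identifiers, table contents and function names are hypothetical, and real papers would need far more careful table and graph parsing.

```python
import re

# Matches numbers followed by a percent sign, e.g. "98.7%" or "99.2 %".
ACCURACY_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def best_accuracy(table_text):
    """Return the highest percentage found in the table text, or None."""
    values = [float(match) for match in ACCURACY_PATTERN.findall(table_text)]
    return max(values) if values else None


def rank_by_accuracy(tables):
    """Sort paper ids by their best reported accuracy, highest first."""
    scored = [(paper_id, best_accuracy(text)) for paper_id, text in tables.items()]
    # Papers with no percentage at all are dropped rather than guessed at.
    scored = [(paper_id, acc) for paper_id, acc in scored if acc is not None]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical table text, invented for this example.
TABLES = {
    "paper-A": "Model | Accuracy\nSparse AE | 98.1%\nDenoising AE | 98.7%",
    "paper-B": "Model | Accuracy\nStacked AE | 99.2 %",
    "paper-C": "Model | Training time\nContractive AE | 3.5 h",
}

ranking = rank_by_accuracy(TABLES)
for paper_id, accuracy in ranking:
    print(paper_id, accuracy)
```

With this toy input, paper-B ranks ahead of paper-A, and paper-C is left out because its table reports no percentage. The regular expression cannot tell an accuracy from an error rate or any other percentage, so a practical tool would also have to read the column headers.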
I believe the IEEE can do better on comprehensive knowledge retrieval, going beyond a simple keyword search or Boolean query. The key question is how to reach the essential knowledge that users want but that is hidden in the mass of papers. The query could use probabilistic, fuzzy or association rules, together with analysis of the semantic meaning of sentences and analysis of objects such as tables and graphs. For practicing engineers, quantitative measures could be addressed in similar ways to locate design or product specifications.
In addition to quantitative knowledge extraction from numbers, modern text search techniques can apply semantic analysis. Methods that are intensely researched in the marketplace include analytics of metadata with proximity searches, clustering using nearest neighbor analysis, visualization of concepts and graph analysis, to name a few of the best-known techniques. In the long run, scripted searches for a given bit of knowledge could not only be automated but also initiated using natural language.
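As one small illustration of the nearest neighbor idea, the sketch below represents each abstract as a bag of words and returns the abstract closest to a query by cosine similarity. This is a deliberately crude stand-in: it measures word overlap, not semantic meaning, and the abstracts, identifiers and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words vector of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbor(query, abstracts):
    """Return the id of the abstract most similar to the query."""
    query_vector = vectorize(query)
    return max(
        abstracts,
        key=lambda paper_id: cosine(query_vector, vectorize(abstracts[paper_id])),
    )


# Hypothetical abstracts, invented for this example.
ABSTRACTS = {
    "paper-A": "denoising autoencoder for handwritten digit recognition",
    "paper-B": "power grid stability under renewable load",
    "paper-C": "antenna design for millimeter wave radar",
}

closest = nearest_neighbor("autoencoder accuracy on handwritten digits", ABSTRACTS)
print(closest)
```

Here the query shares the words "autoencoder" and "handwritten" with paper-A and nothing with the other two, so paper-A is returned. Note that "digits" and "digit" do not match, which is the kind of gap that stemming or learned text embeddings are meant to close.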
I believe we should plan to deliver more knowledge to our members and subscribers, as opposed to providing them with classic information alone. An AI-aligned IEEE Xplore would shorten the learning curve and surface the highly relevant results that people are looking for and currently spend a considerable amount of time finding.
Venturing into the future of AI tools, we can imagine that conversational systems will eventually help humans interface with computers. So, a query for knowledge can be initiated and an engineer or scientist can be engaged in a back-and-forth dialogue with the machine. After a conversational system receives a message, it could generate alternative responses and rank them.