Arijit Sengupta



The following are current and past projects that I am, or was, involved in. Many of the older projects are still viable but need some polishing, so if any of them interest you, feel free to check them out and give me some feedback. Email me if you want to join any of the projects.

  1. Circle - Circuit Classifier (current)
    Circle is a novel classifier for data mining. The algorithm is based on circuit minimization techniques and can find classification rules in databases with many factors and a single classifier. It works with string as well as numeric values, and can also create non-monotonic rules.
    1. Publications: Presented at ACM SAC 2005 Data Mining track (poster), IEEE ICARCV '04 (invited paper)
    2. Planned Submission: To be submitted to Data & Knowledge Engineering, special issue on intelligent data mining (deadline: May 20)
    3. Status: ACTIVE, currently in the writing and experimentation phase
    4. Collaborator: Mehmet Dalkilic, Informatics, Indiana University
  2. CATPA - Curation and Annotation Tool for Protein Analysis (current)
    CATPA is a complete system for alignment and annotation of genes, including database storage, distributed usage, and methods for adding and searching annotations.
    1. Publications: Presented at ACM SAC 2005, Bioinformatics track
    2. Planned Submission: To be submitted to NAR database issue
    3. Status: completed, under publication cycle
    4. Collaborator: Mehmet Dalkilic, Informatics, Indiana University
  3. QBT (Query By Templates)
    QBT is a method for interactive user query formulation on XML databases, based on the "shape" of the data.
    1. Publications: Presented at IEEE Sym. on Adv. Digital Libraries 1997
    2. Review: Under second round review at IEEE Transactions on Prof. Comm.
    3. Status: completed, under publication cycle
    4. Collaborator: Dr. Andrew Dillon, School of IT, UT Austin
  4. ACXESS (Access Control for XML with Enhanced Security Specifications)
    ACXESS is a method for providing access control facilities for XML used both as a native format and as an exchange format.
    1. Publications: none
    2. Review: Under review at VLDB Conference, as full paper as well as demo and Ph.D. Consortium article for Sriram Mohan (my Ph.D. student). Also under review for an NSF grant (NSF IIS Collaborative Systems grant, submitted 5/3/05)
    3. Planned Submission: To be submitted to VLDB Journal special issue on Privacy Preserving Data Management, deadline September 15 2005
    4. Status: ACTIVE, design and implementation stage. NEED HELP! We need to create a model of security based on theory, which is something I am not entirely comfortable doing myself. The plan is to develop a survey based on this model to validate it.
    5. Collaborator: Dr. Yuqing Wu, Informatics, IU, and Sriram Mohan, CS, IU
  5. MOQ (Measurement Ontology for Quality control) and KROX (Knowledge Representation Ontology for XML)
    MOQ and KROX are a set of ontologies for quality assurance, measurement, and web service workflow strategies.
    1. Publications: To be presented at ECIS 2005; parts published in AIS SIGSEMIS Bulletin 2(1), pp. 42-46, March 2005
    2. Review: Measurement Ontology under review at Journal of Database Management, Quality Control Ontology under review at JAIS, KROX under review at ITM
    3. Status: ACTIVE - more work to be performed on web services components
    4. Collaborator: Dr. Henry Kim, York University, Canada
  6. XER (Extensible Entity Relationship)
    A conceptual model for XML structures.
    1. Publications: XML 2003 full paper, WITS 2003 Demonstration Session, Idea Group book chapter 2005
    2. Review: Under review at JAIS special track on Systems Analysis and Design
    3. Status: ACTIVE - one more submission planned with theoretical results
    4. Collaborator: Sriram Mohan, CS, IU
  7. ST - Semantic Thumbnails (current)
    Semantic Thumbnails provide a visual as well as semantic summary of documents. This method is the core of the BioKnOT project.
    1. Publications: Presented at XML 2004 (adapted for XML documents) and at ACM SIGDOC 2004 (for plain text documents)
    2. Planned Submission: To be submitted to JDOC (Emerald Journal of Documentation), possibly by the end of August 2005
    3. Status: ACTIVE, Planned user study in May-June 2005
    4. Collaborator: Mehmet Dalkilic, Informatics, Indiana University
  8. TRACS - TRActable Conference Scheduling (current)
    This is an OR-based method for automating the generation of conference schedules. It uses information such as reviewer scores and the estimated popularity of papers, and places better papers in higher-attended sessions.
    1. Publications: Presented at DSI 2004
    2. Planned Submission: To be submitted to DSI Journal, eventually
    3. Status: NEED HELP! Would really like help from someone with good ops knowledge for deciding on methods for comparing the quality of schedules. Implementation finished, some experimentation needed.
    4. Collaborator: Malvika Gulati, ODT at IU
  9. DocBase II (current)
    DocBase II is a database system for structured documents. Its original version, called DocBase, was the subject of my dissertation. Currently DocBase II is in its second redesign and development phase, with an eventual goal of public release. More information on DocBase II can be found at the DocBase II home page.
    1. Publications: DSQL presented at WITS 2002 (user study) and CAiSE 2002 (language capabilities). DocBase presented at CoMAD 1998 and the INEX 2004 workshop, and published as an edited book chapter from Idea Group, 2000
    2. Planned Submission: DocBase to be submitted soon (JSS?) and DSQL final touches are being made for potential submission to JAIS
    3. Collaborator: Ramesh Venkataraman, IS@IU, and Sriram Mohan CS, IU
  10. ADAM - Automated Domain Modeling Assistant (current)
    Adam is the code name for a domain modeling project. In this project, we develop a system that can take multiple domain models in the form of XML representations (which may or may not correspond to an underlying Entity-Relationship Model) and generate a model based on the most probable domain model. More information about Adam can be obtained from the Adam home page.
    1. Publications: WITS 2000 full paper, WITS 2000 Demonstration Session
    2. Planned Submission: Possibly DKE - still looking for alternative venues
    3. Status: Completed. Data collected and processed.
    4. Collaborator: Dr. Sandeep Purao, IST at Penn State and Dr. Veda Storey, CIS at GSU
  11. Information Systems Provenance
    A method for keeping track of software development history across versions as well as development milestones.
  12. Mobistik
    This is a "fun" project, only marginally related to my research streams, but I thought it would be a good idea, and it got into HCI International. It's about improving mobile device interfaces using a pointing device like the trackpoint.
    1. Publications: Accepted at HCI International 2005 (Short paper/extended abstract)
    2. Status: NEED HELP! Need to simulate this device on either a mobile phone or on a laptop and see whether user performance is affected by this new pointing mechanism. Ultimately a nice empirical study.
    3. Collaborator: none.
  13. Fred (dormant)
    Fred is the code name for a Case Based Reasoning modeling tool, in which a user can specify his or her case model either by a conversion process from an existing system or by creating a new model. Fred constructs a rudimentary CBR application where someone can implement the CBR case model and perform Case Retrieval.
  14. E-Cage (dormant)
    Conceptually similar in nature to the above project, E-Cage (Electronic Commerce Application GEnerator) aims at developing a rudimentary Electronic Commerce Application using a base model and extensions provided by the user.