DatGen Parameters

  1. Overview
  2. Predicting Attributes
  3. Rules
  4. Tuples
  5. General

  1. Overview
     The DatGen parameters modify both the characteristics of the generated domain theory and the way the theory is applied to generate events. Rules are composed of a head and a body, where the head is a single predicted attribute-value and the body is composed of one or more predicting attribute-values.
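The head/body structure of a rule can be sketched as a small data structure. This is only an illustration: the names `Rule`, `fires`, and the example attributes are hypothetical and not part of DatGen, and the body is assumed to match only when all of its attribute-values are present in the event.

```python
from dataclasses import dataclass

# Hypothetical representation for illustration; DatGen's internal format may differ.
@dataclass(frozen=True)
class Rule:
    body: tuple  # one or more (predicting_attribute, value) pairs
    head: tuple  # a single (predicted_attribute, value) pair

    def fires(self, record: dict) -> bool:
        """Return True when every attribute-value in the body matches the record."""
        return all(record.get(attr) == value for attr, value in self.body)

rule = Rule(body=(("color", "red"), ("size", "large")), head=("class", "A"))
event = {"color": "red", "size": "large", "weight": "heavy"}

print(rule.fires(event))  # True
print(rule.head)          # ('class', 'A')
```

Keeping the head as a single pair mirrors the description above: a rule predicts exactly one attribute-value, while its body may mention several predicting ones.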

  2. Predicting Attributes
     In synthetic classification data sets there is one predicted attribute; the remaining attributes are predicting attributes. The quantity and distribution of the predicted attribute's values are determined by the rule base. Predicting attribute characteristics include their quantity, datatype, and domain size. Furthermore, some common real-world disturbances should also be modeled, including missing relevant attributes and completely irrelevant attributes.
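As a rough sketch of how these characteristics might be recorded, the following builds a list of attribute descriptions. All names here (`AttributeSpec`, `build_schema`, `visible_columns`, the `r`/`x` prefixes, the "nominal" label) are assumptions made for illustration and are not DatGen parameters. A "missing relevant attribute" is modeled as one that rules may use but that is left out of the output.

```python
from dataclasses import dataclass

@dataclass
class AttributeSpec:
    name: str
    datatype: str          # assumed label, e.g. "nominal"
    domain_size: int       # number of distinct values the attribute can take
    relevant: bool = True  # False: irrelevant, never referenced by any rule
    hidden: bool = False   # True: relevant but omitted from the output

def build_schema(n_relevant, n_irrelevant, n_hidden, domain_size):
    """Build attribute specs; the first n_hidden relevant attributes are hidden."""
    if n_hidden > n_relevant:
        raise ValueError("cannot hide more attributes than are relevant")
    schema = [
        AttributeSpec(f"r{i}", "nominal", domain_size, relevant=True, hidden=i < n_hidden)
        for i in range(n_relevant)
    ]
    schema += [
        AttributeSpec(f"x{i}", "nominal", domain_size, relevant=False)
        for i in range(n_irrelevant)
    ]
    return schema

def visible_columns(schema):
    """Names of the attributes a discovery tool would actually see."""
    return [a.name for a in schema if not a.hidden]

schema = build_schema(n_relevant=3, n_irrelevant=2, n_hidden=1, domain_size=4)
print(visible_columns(schema))  # ['r1', 'r2', 'x0', 'x1']
```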


  3. Rules
     The generated domain theory is composed of rules in conjunctive normal form. These parameters set the number of such rules, their qualitative structure, and how they are invoked to create events.
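For readers unfamiliar with the term, a body in conjunctive normal form is an AND of clauses, where each clause is an OR of attribute-value tests. A minimal sketch of checking such a body against a record follows; the list-of-lists layout and the name `body_holds` are illustrative assumptions, not DatGen's own format.

```python
def body_holds(cnf_body, record):
    """True when every clause has at least one (attribute, value) test that matches."""
    return all(
        any(record.get(attr) == value for attr, value in clause)
        for clause in cnf_body
    )

body = [
    [("color", "red"), ("color", "blue")],  # color is red OR blue
    [("size", "large")],                    # AND size is large
]

print(body_holds(body, {"color": "blue", "size": "large"}))   # True
print(body_holds(body, {"color": "green", "size": "large"}))  # False
```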

  4. Tuples
     Once the predicting attributes and the rule base are defined, the system can generate data tuples (records). As the number of tuples increases, discovery tools can generally make better predictions because more information is available.
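One plausible way to turn a rule base into tuples is sketched below: pick a rule at random, fill every predicting attribute with a random value, then overwrite the attributes named in the rule body and set the predicted attribute from the head. This is an assumed procedure for illustration, not necessarily how DatGen invokes its rules; `generate_tuples`, `domains`, and the example rules are hypothetical.

```python
import random

def generate_tuples(n, domains, rules, seed=0):
    """domains: {attribute: [values]}; rules: list of (body_dict, (head_attr, head_value))."""
    rng = random.Random(seed)  # seeded so repeated runs give the same tuples
    rows = []
    for _ in range(n):
        body, (head_attr, head_value) = rng.choice(rules)
        # Random values for all predicting attributes, then force the body to hold.
        row = {attr: rng.choice(values) for attr, values in domains.items()}
        row.update(body)
        row[head_attr] = head_value
        rows.append(row)
    return rows

domains = {"color": ["red", "blue"], "size": ["small", "large"], "noise": ["a", "b", "c"]}
rules = [
    ({"color": "red", "size": "large"}, ("class", "A")),
    ({"color": "blue"}, ("class", "B")),
]

rows = generate_tuples(5, domains, rules)
print(len(rows))  # 5
```

Raising `n` simply draws more events from the same rule base, which is what gives a discovery tool more evidence about each rule.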


  5. General



Home Page: http://www.datasetgenerator.com
Comments: melli@cs.sfu.ca