Paper 13

Paper Title: Domain-Driven, Actionable Knowledge Discovery

Authors: Longbing Cao, University of Technology, Sydney

Three Critical Questions


Group 1:

Member Name: Abilash Amarthaluri

1. Clustering the data efficiently and accurately is very hard. This is also a serious problem for short-term and long-term prediction in knowledge discovery, because the processing techniques used may introduce lag into the mining process. What could be done to make the data easy to process without lag?

2. Automating the data mining process is not easy: a methodology may be needed for extracting the data, and there may also be constraints. How can users' privacy be ensured during the data mining process?

3. We also need to consider the quality of the knowledge extracted by the data mining process. Are the results of mining consistent, i.e., will we get the same outcomes every time we extract the data?

4. The data is also available on the network to everyone. What if intruders on the network look into the data and corrupt it?

Group 2:

Member Name: Sai ram Kota

Q1: The authors discuss only one algorithm for the text mining process, the social filtering algorithm. They do not explain why they used only this algorithm. Are there other algorithms that could implement this system, and does this one have any special advantage over the others?

Q2: The authors do not say how they will determine whether a consumer's skill level is beginner, professional, or expert. Will it be inferred from the user's sentence patterns, or determined in some other way?

Q3: Since the authors use text mining software and an ontology to extract information from reviews, they will be using a knowledge base. They do not discuss identifying new ontologies based on existing reviews, or handling new, unknown concepts that are encountered. Extension of the knowledge base is also not discussed.

Group 3:

Member Name: Sunil Kakaraparthi, Yaswanth Kantamaneni

1. Actionable knowledge discovery should fit a framework that captures interestingness from both the technological and the business perspective. How is the complexity of the framework handled, and how does the author address this issue?
2. Domain-driven data mining discloses applications and identifies challenges and directions for future work. What constraints and difficulties do developers face in disclosing these applications, and what measures should be taken when deploying them?
3. Domain intelligence is integrated with the automated construction of activity sequences of government customer contracts. How is the integration performed, and what issues must be considered in the integration?
4. There is technological and business interestingness for an activity sequence associated with the occurrence of debt. Is there any other kind of interestingness for an activity sequence that would increase the quality and reduce the cost of the output for an activity?

Group 4:

Member Name: Prashant Sunkari

• The paper states that “traditional KDD” uses a “trial and error process” to mine the data. But how are different types of data mined, and is this process efficient for mining complex domains?
• The author states that different algorithms are used to translate models into actions. These algorithms restrict the actions that can be obtained, since the actions are limited by costs and similar bounds, which leads to NP-hard problems. The paper does not state how this problem is resolved.
• The output of data mining is obtained as statistical data. How is this data used to obtain actions?
• The “metasynthesis” technique is used to achieve mining using different methodologies. But how are these methodologies differentiated for metadata and for larger domains?
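
One hedged way to picture the step from a statistical model to an action, as asked in the bullets above: treat each decision-tree leaf as a probability estimate, and score a candidate move between leaves by its expected gain minus the cost of the attribute changes. The sketch below is illustrative only. The probabilities, costs, profit figure, and the names `Move`, `net_benefit`, and `pick_actions` are hypothetical and are not taken from the paper. The selection is a simple greedy choice under a limit on the number of actions, which is a heuristic and not an exact solution to the NP-hard bounded problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """A candidate action: shift one customer from a source leaf to a target leaf."""
    customer: str
    p_source: float  # estimated probability of the desired status at the current leaf
    p_target: float  # estimated probability of the desired status at the target leaf
    cost: float      # assumed cost of changing the attributes needed for the move


def net_benefit(move, profit):
    """Expected gain of the move minus its cost (an assumed scoring rule)."""
    return (move.p_target - move.p_source) * profit - move.cost


def pick_actions(moves, profit, max_actions):
    """Greedily keep the best positive-benefit moves, at most max_actions of them."""
    scored = [(net_benefit(m, profit), m) for m in moves]
    worthwhile = [(b, m) for b, m in scored if b > 0]
    # Sort by benefit (descending); break ties by customer name so the
    # result is deterministic when two customers score the same.
    worthwhile.sort(key=lambda bm: (-bm[0], bm[1].customer))
    return [m for _, m in worthwhile[:max_actions]]


# Hypothetical data for illustration.
moves = [
    Move("alice", 0.20, 0.80, 100.0),
    Move("bob", 0.50, 0.60, 200.0),
    Move("carol", 0.10, 0.90, 300.0),
]
chosen = pick_actions(moves, profit=1000.0, max_actions=1)
print([m.customer for m in chosen])
```

The explicit tie-breaking rule is a design choice of this sketch; any fixed ordering would serve when only one action slot remains.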

Group 5:

Member Name: Lokesh Reddy Gokul

• In the learning actions section, the authors describe ARMS, which generates test data for various possible actions and finds the corresponding parameters, actions, preconditions, and constraint sets. Generating and storing all this test data may eventually become too tiresome, and the approach may lead to the observation that knowledge and intelligence become a side effect of scale rather than of the algorithm or method being used. So how can such a technique prove efficient over the long term?
• In the learning actions section, the author discusses the need for, and ways of, moving a user from the not-stay status to the stay status using decision trees of invariable depth. He doesn’t consider the trade-off with the complexity of the decision tree, which may involve additional costs incurred in forcing a user to move to another state when the user has long been unwilling to do so. How can such a trade-off affect the efficiency of this technique?
• In the technology push section, the author talks about tracking an object or system, or deciding the next moves that may be efficient for its purpose. He doesn’t mention how noise in the scenario should be handled, even though noise may play a role in some cases. Is such a consideration important for a more accurate model? What are the corresponding pros and cons?

Group 6:

1) When we implement a domain-driven design for software, we need to concentrate more on the domain specifics than on the actual data we might deal with. For complex applications, the input data is not confined to a restricted boundary. How should such situations be handled?
2) In actionable data mining for video and visual systems, action sequences are used to predict and make decisions, as in traffic systems. When each user exhibits different behavior, how can the system predict it?
3) Data mining identifies only the connections between entities, not the causal relationships. When this design is used for developing software with a domain-driven model, how can it address the problems in that domain when it does not even know the relationships between the actors?

Group 7:

Member Name: Kishore Kumar Mannava

1. In the second technique, the author mentions that we need to learn the “action models” from predefined “frequent item sets”. He also mentions that constructing those models is often very tedious. Then why does the author propose this technique for “actionable data mining”?
2. “Traditional data mining” algorithms focus mainly on producing “statistical data” and “test data”. But today’s industrial needs require techniques that produce “actions” that are executed repeatedly. Then why are we still using the “traditional” algorithms for various requirements?
3. The “decision tree” technique places a bound on the number of “actions” or their overall cost, which leads to an “NP-hard” problem. Then why are we still using this method when its “performance” is very low?
4. “Video and visualization” systems lead to many problems, such as producing “unstructured data”, difficulty in assembling the various patterns, and a lot of “noise” and disturbance introduced during transmission. Then why use such a technique for “mining” the “actionable data”?

Group 8:

Member Name: N. Lakshmi Bhavani

1. It is stated that data mining typically aims at producing “statistical models” as its result. How are actions generated and executed automatically from these statistical models?
2. It is stated that video data varies from individual to individual and from episode to episode. How can the “next move” or the “idiosyncratic behavior” be determined? And how can the “noise and irrelevant” data be removed?
3. ARMS proceeds in two phases; the action sets are derived from the previous action sets, used as inputs, and transformed into constraints. How is uncertainty handled under these conditions when there is uncertainty in phase one?
4. How are the actions derived from the frequent action sets? The system requires predefined action models with an initial state, and defining these models is difficult.


Group 1:

Member Name: Lattupalli, Pelluri, Voruganti

The solution the author proposes for extracting actions from decision trees limits the number of actions. Consider a situation where two customers have the same probability: a problem arises in moving them to another node if only one slot remains (since the number of actions performed is limited). How does the author deal with such situations?

If the gap between research results and business expectations is the targeted problem, then what problems have the research teams concentrated on?

The author explains the outputs of domain-driven data mining, but what methodology does he use to capture real-time human involvement?

Group 2:

Member Name: Addagalla, Bobbili, Gopinath

• Domain-driven data mining has a lot of steps, such as extracting action models, frequent action sequences, decision trees, etc. Will this data extraction be time consuming, and what is the possibility of data loss when going through these various steps?
• In the paper the author gives the example of the traffic signal which follows transductive reasoning (active patterns). The roads may change due to construction or there may be traffic diversions. How will the active patterns differ due to this?
• The author gives the example of government social security overpayments. When dealing with huge amounts of social security data, what level of validation and authentication is done by domain-driven data mining?
• The author projects domain-driven data mining as a means to ensure seamless integration of the academic research into the business world. The question is whether domain-driven data mining is a sure-shot solution to bridging the gap between academia and business?

Group 3:

Member Name: Swati Thorve

1) What is the exact difference between the data-driven and domain-driven data mining and their approaches?

2) What do we define as the characteristics of a domain, and how should they be captured in a framework for generating actionable knowledge? Are there fixed sets of methods, or do they change as the problem domain changes?

3) Are there any ways to create generalizations over a series of problems? Can the frameworks be reusable?

Group 4:

Member Name: Shaiv, Karuna Priya Rameshwaram, Anusha Vunnam

1) Consider a data mining system that does produce actionable methods. Does it really need to be executed both semi-automatically and automatically when generating actions?
2) For extracting actions from decision trees, the author proposes two techniques, both of which involve very complicated methods. What about the economic cost of including such methods?
3) A video and vision system has its own requirements in various aspects. Would it be possible to satisfy all those complicated requirements?

Group 5:

Member Name:

1. What are the ways that Domain-Driven Data Mining uses to identify the domain problems and what are the factors that are considered for identifying those problems?
2. What techniques are used to uncover and disclose applications to solve practical problems?
3. Which of the several criteria are considered while making changes to the attributes to move a customer to the desired status?

Group 6:

Member Name:

Group 7:

Member Name:

Group 8:

Member Name: Bhargav Sandeep Ramayanam

1. In the paper, the author mentions the mapping only for the metric distance case. Is the mapping valid only for this case, or for other cases as well? The circles he constructs are guaranteed to intersect with respect to the two reference points.
2. In the paper, the author presents the DNA representation and visualization only for small instances of the data. Will this be applicable to larger data as well? As the data grows, the complexity of the representation and visualization also increases.
3. In the brain informatics section, the author mentions the cognitive process and the neurological process, and makes some statements that combine the two. Will they support each other? In some cases the neurological process won’t support the cognitive process.

Group 9:

Member Name: Satish Bhat, Holly Vo

1. Domain Driven KDD is still in its infancy. The lack of a proper methodology or framework makes its future uncertain.
2. Domain Driven KDD can be a complex process. The paper does not explain the details of application efficiency when KDD is integrated with legacy systems. Performance analysis details are also not mentioned.
3. Techniques used for actionable data mining are specific to an application. So each time actionable data is required, domain experts need to follow an elaborate process. Can we create a methodology that is neither too specific nor too general?

Group 10:

Member Name: Sunae Shin, Hyungbae Park

1. The definition of actionable knowledge discovery is too abstract. They made a simple notation to show the big picture of the actionable knowledge discovery. However, there are no additional explanations for the detailed pictures to understand exactly what the actionable knowledge discovery is.
2. In the applications section of the first article, they showed an example of how domain-driven data mining is used. This helps me understand more clearly what domain-driven data mining is.
3. The second article does not give the reason why we should use actionable data mining rather than techniques that produce data-centric outputs.
4. In the learning from frequent-action sequences they applied a frequent-item-set-mining algorithm to the traces to determine a collection of frequent-action sets. Unfortunately, there is no explanation for the frequent-item-set-mining algorithm. If there is some explanation how the algorithm determines a collection of frequent-action-sets then it would be easier to understand how the frequent-action-sets are being used.
5. In the third article they showed a table which describes the pattern of some object’s behavior. However, they didn’t mention the time interval used for capturing the behavior. If we use various time intervals for capturing the object’s behavior, then every pattern will be different.
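
As a rough illustration of the kind of frequent-item-set mining mentioned in point 4 above, the sketch below runs a level-wise (Apriori-style) search over a handful of action traces. It is an assumption-laden example, not the algorithm used in the article: the function name `frequent_action_sets`, the traces, and the support threshold are hypothetical, and each trace is treated as an unordered set of actions, so ordering within a trace is ignored.

```python
from itertools import combinations


def frequent_action_sets(traces, min_support):
    """Return {action set: count} for sets that occur in at least min_support traces."""
    transactions = [frozenset(t) for t in traces]
    # Level 1 candidates: every single action seen in any trace.
    candidates = {frozenset([a]) for t in transactions for a in t}
    frequent = {}
    size = 1
    while candidates:
        # Count how many traces contain each candidate set.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent sets into candidates one action larger.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


# Hypothetical action traces for illustration.
traces = [
    ["load", "move", "unload"],
    ["load", "move"],
    ["move", "unload"],
    ["load", "move", "unload"],
]
result = frequent_action_sets(traces, min_support=3)
for actions, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(actions), count)
```

With these traces and a threshold of 3, the pair of "load" and "unload" appears in only two traces and is dropped, which in turn prunes the three-action set before it is ever counted.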
