Predicate Logic
Logic can be defined as the scientific study of the process of reasoning and of the system of rules and procedures that aid the reasoning process. Basically, a logical process takes in some information (called premises) and produces some output (called conclusions).
Logic is basically classified into two categories:
· Propositional logic
· Predicate logic
First Order Predicate Logic (FOPL):
FOPL was developed by logicians as a means for formal reasoning, primarily in the areas of mathematics. In FOPL, statements from a natural language such as English are translated into symbolic structures composed of predicates, functions, variables, constants, quantifiers and logical connectives. These symbols form the basic building blocks for the knowledge, and their combination into valid structures is accomplished using the syntax of FOPL.
Propositional logic works fine in situations where the result is either true or false but not both. However, there are many real-life situations that cannot be treated this way. In order to overcome this deficiency, first order logic (predicate logic) uses three additional notions: predicates, terms and quantifiers.
Predicates: a predicate is defined as a relation that binds two atoms together.
Ex: "Bhaskar likes aeroplanes" is represented as
Likes (Bhaskar, aeroplanes)
Here Likes is a predicate that binds the two atoms Bhaskar and aeroplanes.
The symbols and rules of combination of predicate logic are:
Predicate symbol: "Rama loves Sita" is represented as Loves(Rama, Sita). Here Loves is a predicate symbol.
Constant symbol: Here the constant symbols are Rama and Sita.
Variable symbol: In "X loves Y", X and Y are variable symbols.
Function symbol: These symbols are used to represent a special type of relationship or mapping. It is also possible to have a function as an argument.
"Ravi's father is Rani's father" is represented as
father (father(Ravi), Rani)
Here the outer father is a predicate and father(Ravi) is a function used as an argument.
Constants, variables and functions are referred to as terms, and predicates are referred to as atomic formulas. The statements in predicate logic are termed well formed formulas, i.e. WFFs. A predicate with no variables is called a ground atom.
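As an illustration (not part of the original notes), these building blocks can be modelled with simple data structures. The class names Const, Var, Func and Atom below are assumed names chosen only for this sketch.

```python
# A minimal sketch (assumed names, not a standard library) of FOPL building blocks.
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Const:          # constant symbol, e.g. Rama, Sita
    name: str

@dataclass(frozen=True)
class Var:            # variable symbol, e.g. x, y
    name: str

@dataclass(frozen=True)
class Func:           # function application, e.g. father(Ravi)
    name: str
    args: Tuple["Term", ...]

Term = Union[Const, Var, Func]

@dataclass(frozen=True)
class Atom:           # predicate applied to terms, e.g. Likes(Bhaskar, aeroplanes)
    pred: str
    args: Tuple[Term, ...]

    def is_ground(self) -> bool:
        """A ground atom contains no variables."""
        def no_vars(t: Term) -> bool:
            if isinstance(t, Var):
                return False
            if isinstance(t, Func):
                return all(no_vars(a) for a in t.args)
            return True
        return all(no_vars(a) for a in self.args)

# Likes(Bhaskar, aeroplanes) is a ground atom; Loves(x, Sita) is not.
likes = Atom("Likes", (Const("Bhaskar"), Const("aeroplanes")))
loves = Atom("Loves", (Var("x"), Const("Sita")))
print(likes.is_ground(), loves.is_ground())   # True False
```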
Connectives: Formulas in predicate calculus can be combined to form more complex formulas using several connectives: the OR connective (V), the AND connective (^), the implies connective (->) and the negation connective (¬).
Quantifiers: A quantifier is a symbol that permits one to declare or identify the range or scope of the variables in a logical expression. There are two basic quantifiers used in logic: the universal quantifier (for all) and the existential quantifier (there exists).
Universal Quantifier: If "a" is a variable, then "for all a" is read as 1. for all a, 2. for each a, 3. for every a.
Existential Quantifier: Similarly, if "a" is a variable, then "there exists a" is read as 1. there exists an a, 2. for some a, 3. for at least one a.
Variables: Free and bound variables: A variable in a formula is free if and only if its occurrence is outside the scope of a quantifier for that variable. A variable is also free in a formula if at least one occurrence of it is free. A variable is bound if and only if its occurrence is within the scope of a quantifier. A variable is also bound in situations where at least one occurrence of it is bound.
Ex: for all x there exists y (A(x, y, z)) and for all z (B(y, z)) --------(1)
In this formula, the variable z is free in the first portion, for all x there exists y (A(x, y, z)), and the variable y is free in the second portion, for all z (B(y, z)).
Ex: for all x (A(x) -> B(x))
In this formula, the quantifier for all applies over the entire formula, so the scope of the quantifier is A(x) -> B(x). Any change in the quantified variable affects both A(x) and B(x); hence the variable x is bound.
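The free/bound distinction can be checked mechanically. The following is a minimal sketch, assuming a formula is encoded as nested tuples; the encoding and the function name free_vars are illustrative choices, not a standard notation.

```python
# Sketch: compute the free variables of a formula encoded as nested tuples.
# ('atom', 'A', ['x', 'y', 'z'])                     -- atomic formula A(x, y, z)
# ('and', f1, f2), ('or', f1, f2), ('not', f)        -- connectives
# ('forall', 'x', f), ('exists', 'y', f)             -- quantifiers

def free_vars(formula):
    kind = formula[0]
    if kind == 'atom':
        return set(formula[2])
    if kind in ('forall', 'exists'):
        _, var, body = formula
        return free_vars(body) - {var}        # the quantifier binds var
    if kind == 'not':
        return free_vars(formula[1])
    # binary connectives
    return free_vars(formula[1]) | free_vars(formula[2])

# Example (1) from the text: [forall x exists y A(x, y, z)] and [forall z B(y, z)]
f = ('and',
     ('forall', 'x', ('exists', 'y', ('atom', 'A', ['x', 'y', 'z']))),
     ('forall', 'z', ('atom', 'B', ['y', 'z'])))
print(free_vars(f))   # {'z', 'y'}: z is free in the first conjunct, y in the second
```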
Syntax for FOPL:
Connectives: there are 5 connective symbols: ~ (not), & (and), V (or), -> (implies) and <-> (equivalence).
Quantifiers: there are 2 quantifier symbols: for all and there exists.
Constants: constants are fixed-value terms that belong to a given domain of discourse.
Variables: variables are terms that can assume different values over a given domain.
Functions: function symbols denote relations defined on domains.
Predicates: a predicate symbol denotes a relation or functional mapping from the elements of a domain D to the values true or false. Capital letters and capitalized words are used to represent predicates. Constants, variables and functions are referred to as terms, and predicates are referred to as atomic formulas or atoms.
Other symbols such as left and right parentheses, square brackets, braces and periods are used for punctuation in symbolic expressions.
Ex: 1. All employees earning $1400 or more per year pay taxes:
for all x ((E(x) & GE(i(x), 1400)) -> T(x))
2. Some employees are sick today:
there exists y (E(y) & Sick(y))
Here E(x) means x is an employee, i(x) denotes the income of x, GE(u, v) means u is greater than or equal to v, T(x) means x pays taxes and Sick(y) means y is sick today.
Semantics for FOPL:
WFF (Well Formed Formulas): A language such as predicate calculus is defined by its syntax. To specify a syntax we must specify the alphabet of symbols to be used in the language and how these symbols are put together to form legitimate (valid) expressions in the language. The legitimate expressions of predicate calculus are called WFFs.
Properties of WFFs:
(1) Equivalence: not(not X) is equivalent to X.
(2) De Morgan's theorem: not(x & y) = not x V not y.
(3) Distributive property.
(4) Commutative property.
(5) Associative property.
(6) Contrapositive: x -> y = not y -> not x.
Clause form: A clause is defined as a WFF consisting of a disjunction of literals. The resolution process, when it is applicable, is applied to a pair of parent clauses and produces a new clause.
Conversion to clausal form: one method we shall examine, called resolution, requires that all statements be converted into a normalized clausal form. We define a clause as the disjunction of a number of literals. A ground clause is one in which no variables occur in the expression. A Horn clause is a clause with at most one positive literal.
To transform a sentence into clausal form requires the following steps:
1. Eliminate all implication and equivalence symbols.
2. Move negation symbols into the individual atoms.
3. Rename variables if necessary so that all remaining quantifiers have different variable assignments.
4. Replace existentially quantified variables with special functions and eliminate the corresponding quantifiers.
5. Drop all universal quantifiers and put the remaining expression into CNF (disjunctions are moved down to the literals).
6. Drop all conjunction symbols, writing each clause previously connected by a conjunction on a separate line.
The existential quantifiers are eliminated through a substitution process. This process requires that all such variables be replaced by Skolem functions: arbitrary functions which can always assume a correct value required of an existentially quantified variable.
Example in predicate logic: Suppose we know that "all Romans who know Marcus either hate Caesar or think that anyone who hates anyone is crazy".
∀x: [Roman(x) ∧ know(x, Marcus)] → [hate(x, Caesar) ∨ (∀y: ∃z: hate(y, z) → thinkcrazy(x, y))]
1. Eliminate the implies symbol, using the fact that a → b = ¬a ∨ b.
∀x: ¬[Roman(x) ∧ know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (∀y: ¬(∃z: hate(y, z)) ∨ thinkcrazy(x, y))]
2. Reduce the scope of each ¬ to a single term.
∀x: [¬Roman(x) ∨ ¬know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (∀y: ∀z: ¬hate(y, z) ∨ thinkcrazy(x, y))]
using ¬(a ∧ b) = ¬a ∨ ¬b and ¬∃x: P(x) = ∀x: ¬P(x).
3. Standardize variables so that each quantifier binds a unique variable. Since variables are
just dummy names, this process cannot
affect the truth value of the wff.
"x:P(x) Ú "x:Q(x)
= "x:P(x) Ú "y:Q(y)
4. Move all quantifiers to the left of the formula
with out changing their relative order.
"x: "y: "z: [Ø Roman (x) ÚØknow(x,marcus)] Ú [hate(x,Caesar) Ú Øhate(y,z) Ú thinkcrazy(x,y))]
5. Eliminate existential quantifiers.
In our example there are no existential quantifiers left after step 4, so there is nothing to eliminate. If an existential quantifier did occur, a Skolem function would be used to eliminate it.
6. Drop the prefix. From (4):
[¬Roman(x) ∨ ¬know(x, Marcus)] ∨ [hate(x, Caesar) ∨ ¬hate(y, z) ∨ thinkcrazy(x, y)]
7. Convert the matrix into a conjunction of disjuncts. In our example there are no conjunctions, so we only apply the associative property (a ∨ b) ∨ c = a ∨ (b ∨ c):
¬Roman(x) ∨ ¬know(x, Marcus) ∨ hate(x, Caesar) ∨ ¬hate(y, z) ∨ thinkcrazy(x, y)
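If the quantifier steps (3 to 6) are carried out by hand as above, the purely propositional steps (1, 2 and 7) can be checked with a library such as sympy. The sketch below treats each atom of the example as a propositional symbol, which is an assumption made only for illustration; sympy does not handle quantifiers or skolemization.

```python
# Sketch: checking the propositional part of the conversion with sympy.
from sympy import symbols, And, Or, Implies
from sympy.logic.boolalg import to_cnf

# Treat each atom of the Marcus example as a propositional symbol.
Roman, Know, HateCaesar, Hates, ThinkCrazy = symbols(
    'Roman Know HateCaesar Hates ThinkCrazy')

wff = Implies(And(Roman, Know), Or(HateCaesar, Implies(Hates, ThinkCrazy)))
print(to_cnf(wff))
# a single clause (up to ordering): ~Roman | ~Know | HateCaesar | ~Hates | ThinkCrazy
```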
Forward versus Backward
Reasoning:
The
object of a search procedure is to discover a path through a problem space from
an initial configuration to a goal state.
There are actually two directions in which such a search could proceed.
· Forward, from the start states
· Backward, from the goal states.
The production system model of the search process provides
an easy way of viewing forward and
backward reasoning as systematic processes.
Forward reasoning from initial states:
1. Root: A tree of move sequences that might be solutions is built by starting with the initial configuration at the root of the tree.
2. Generate the next level: The next level of the tree is generated by finding all the rules whose left sides match the current state (the root node) and using their right sides to create new configurations as new nodes. Generate each following level by taking every node generated at the previous level and continue reasoning forward until a configuration matches the goal state.
Backward reasoning from the goal states:
1. Root: A tree of move sequences that might be solutions is built by starting with the goal configuration at the root of the tree.
2. Generate the next level: The next level of the tree is generated by finding all the rules whose right sides match the root node. Their left sides are used to generate the new nodes, representing new goal states to be achieved. Generate the next level of the tree by taking each node at the previous level and continuing the reasoning until a node that matches the initial state is generated. This is often called goal-directed reasoning.
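A minimal sketch of both directions is given below, assuming rules of the simple form "set of conditions -> conclusion"; the rule set and fact names are invented for illustration.

```python
# Sketch: forward vs. backward reasoning over simple condition -> conclusion rules.
RULES = [
    ({'a', 'b'}, 'c'),     # if a and b then c
    ({'c'}, 'goal'),       # if c then goal
]

def forward_chain(facts, goal):
    """Start from the initial facts and apply rules until the goal appears."""
    facts = set(facts)
    changed = True
    while changed and goal not in facts:
        changed = False
        for lhs, rhs in RULES:
            if lhs <= facts and rhs not in facts:   # LHS matches current state
                facts.add(rhs)                      # RHS creates a new configuration
                changed = True
    return goal in facts

def backward_chain(facts, goal):
    """Start from the goal and work back to facts that are already known."""
    if goal in facts:
        return True
    for lhs, rhs in RULES:
        if rhs == goal and all(backward_chain(facts, sub) for sub in lhs):
            return True
    return False

print(forward_chain({'a', 'b'}, 'goal'))   # True
print(backward_chain({'a', 'b'}, 'goal'))  # True
```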
Non-Monotonic Reasoning Systems
Monotonic reasoning:
1. It is complete with respect to the domain of interest.
2. It is consistent.
3. New facts can be added only when they are consistent with the facts that have already been asserted.
Non-monotonic reasoning:
1. It is incomplete.
2. It need not remain consistent.
3. New facts may contradict and invalidate the old knowledge.
A monotonic reasoning system cannot work effectively in real-life environments because the information available is always incomplete. As time goes by, situations change and so do the solutions. Default assumptions are made in order to reduce the search time and arrive at solutions quickly.
Basic concepts of non-monotonic
reasoning systems:
AI systems provide solutions for those problems whose facts and rules of inference are explicitly stored in the database and knowledge base. But, as mentioned above, data and knowledge are incomplete in nature and default assumptions are generally made.
Non-monotonic reasoning systems are more complex than monotonic reasoning systems. Monotonic reasoning systems generally do not provide facilities for altering facts or deleting rules, since doing so would have an adverse effect on the reasoning process.
One of the major systems that has been implemented using non-monotonic reasoning with dependency-directed backtracking is the Truth Maintenance System (TMS) of Doyle.
Dependency-directed backtracking helps a great deal in non-monotonic reasoning systems, since such a system must detect and remove contradictions. A contradiction occurs when the system finds that a newly discovered state is inconsistent with the existing ones.
TMS: A TMS, also known as a belief revision or reason maintenance system, is a companion component to an inference system. The main job of the TMS is to maintain the consistency of the knowledge being used by the problem solver, not to perform any inference functions. The TMS also gives the inference component the latitude to perform non-monotonic inferences. When new discoveries are made, more recent information can displace previous conclusions that are no longer valid. In this way the set of beliefs available to the problem solver will continue to be current and consistent.
Diagram: (figure not reproduced) the inference engine and the TMS exchange Tell and Ask messages.
Architecture of the problem solver with a
TMS
The figure shows the role played by the TMS as a part of the problem solver. The inference engine (IE) solves domain problems based on its current belief set, while the TMS maintains the currently active belief set. The updating process is incremental. After each inference, information is exchanged between the two components: the IE tells the TMS what deductions it has made, and the TMS, in turn, asks questions about current beliefs and reasons for failures. It maintains a consistent set of beliefs for the IE to work with, even as new knowledge is added and removed.
The TMS maintains complete records of the reasons or justifications for beliefs. Each proposition having at least one valid justification is made a part of the current belief set. Statements lacking acceptable justifications are excluded from this set. When a contradiction is discovered, the statements responsible for the contradiction are identified and an appropriate one is retracted. This in turn may result in other retractions and additions. The procedure used to perform this process is called dependency-directed backtracking.
The TMS maintains records to reflect retractions and additions so that it will always know its current belief set. The records are maintained in the form of a dependency network. The nodes in the network represent KB entries such as premises, conclusions and inference rules. Attached to the nodes are justifications, which represent the inference steps from which the node was derived. Nodes in the belief set must have valid justifications. A premise is a fundamental belief which is assumed to be always true. Premises need no justifications; they form a base from which all other currently active nodes can be explained in terms of valid justifications.
Diagram: (dependency network figure not reproduced) node types include premises, assumptions and datum nodes, each with attached justifications. A support-list justification has the form
(SL <in-list> <out-list>)
CD justifications are used less frequently than SL justifications. They justify a node as a type of valid hypothetical argument:
(CD <consequent> <in-hypotheses> <out-hypotheses>)
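A very simplified sketch of how SL justifications can determine whether a node is IN or OUT is given below. The class and field names are assumptions for illustration and are not Doyle's actual implementation.

```python
# Very simplified sketch of SL (support-list) justifications in a TMS.
class Node:
    def __init__(self, name, premise=False):
        self.name = name
        self.premise = premise          # premises are believed unconditionally
        self.justifications = []        # list of (in_list, out_list) pairs

    def add_sl(self, in_list, out_list):
        self.justifications.append((list(in_list), list(out_list)))

def is_in(node, believed):
    """A node is IN if it is a premise, or has an SL justification whose
    in-list nodes are all IN and whose out-list nodes are all OUT."""
    if node.premise:
        return True
    for in_list, out_list in node.justifications:
        if all(n.name in believed for n in in_list) and \
           all(n.name not in believed for n in out_list):
            return True
    return False

# "winter" is a premise; "frozen" is justified by winter being IN and "warm" OUT.
winter = Node("winter", premise=True)
warm   = Node("warm")
frozen = Node("frozen")
frozen.add_sl(in_list=[winter], out_list=[warm])

believed = {"winter"}               # current belief set, by node name
print(is_in(frozen, believed))      # True: winter is IN and warm is OUT
```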
Bayes theorem: Bayes' theorem is used for the computation of probabilities; it provides a method for reasoning about partial beliefs. Here every event that has happened or is likely to happen is quantified with a probability, and the rules of probability theory dictate how these numerical values are to be calculated. This method was introduced by the clergyman Thomas Bayes in the 18th century. This form of reasoning depends on the use of conditional probabilities of specified events when it is known that other events have occurred. For two events H and E with probability P(E) > 0, the conditional probability of event H, given that event E has occurred, is defined as
P(H/E) = P(H&E) / P(E) --------(1)
Read this expression as the probability of hypothesis H given that we have observed evidence E.
The conditional probability of event E given that
event H occurred can likewise be written as
P(E/H)=P(H&E)/P(H)----------(2)
From Eq. (1) and Eq. (2) we get P(H/E) = P(E/H) * P(H) / P(E) --------(3)
This equation expresses the notion that probability of
event H occurring when it is known that event E occurred is the same as the
probability that E occurs when it is known that H occurred, multiplied by the
ratio of the probabilities of the 2 events H, E occurring.
The probability of an arbitrary event B can always be expressed as
P(B) = P(B&A) + P(B&~A) = P(B/A) P(A) + P(B/~A) P(~A)
Using this result, (3) can be written as
P(H/E) = P(E/H) P(H) / [P(E/H) P(H) + P(E/~H) P(~H)]
The above equation can be generalized for an arbitrary number of hypotheses Hi, i = 1, 2, …, k. Suppose the Hi partition the universe, i.e. the Hi are mutually exclusive and exhaustive. Then for any evidence E we have
P(E) = Σ(i=1..k) P(E & Hi) = Σ(i=1..k) P(E/Hi) P(Hi)
and hence
P(Hi/E) = P(E/Hi) P(Hi) / Σ(j=1..k) P(E/Hj) P(Hj)
where P(Hi/E) = the probability that hypothesis Hi is true given evidence E,
P(E/Hi) = the probability that we will observe evidence E given that hypothesis Hi is true,
P(Hi) = the a priori probability that hypothesis Hi is true in the absence of any specific evidence (these probabilities are called prior probabilities or priors), and
k = the number of possible hypotheses.
Disadvantages:
(1) The knowledge acquisition problem is insurmountable; too many probabilities have to be provided.
(2) The space required to store all the probabilities is large.
(3) The time required to compute the probabilities is large.
Problem: Consider a bulb-manufacturing unit. Machines M1, M2 and M3 make 30%, 30% and 40% of the total output of bulbs respectively. Assume that 2%, 3% and 4% of their respective outputs are defective. A bulb is drawn at random and is found to be defective. What is the probability that the bulb was made by machine M1, M2 or M3?
Solution: Let E1, E2, E3 be the events that a bulb selected at random was made by M1, M2, M3 respectively, and let Q denote that it is defective. From the given data, prob(E1) = 0.3, prob(E2) = 0.3 and prob(E3) = 0.4.
These represent the prior probabilities.
Prob of drawing a defective bulb made by M1 = prob(Q/E1) = 0.02
Prob of drawing a defective bulb made by M2 = prob(Q/E2) = 0.03
Prob of drawing a defective bulb made by M3 = prob(Q/E3) = 0.04
These values are the conditional probabilities (likelihoods).
prob(E1/Q) = prob(E1) * prob(Q/E1) / Σ(i=1..3) prob(Ei) * prob(Q/Ei)
= (0.3)(0.02) / [(0.02)(0.3) + (0.03)(0.3) + (0.04)(0.4)]
= 0.006 / 0.031 = 0.1935
Similarly, prob(E2/Q) = (0.3)(0.03) / 0.031 = 0.2903
prob(E3/Q) = (0.4)(0.04) / 0.031 = 0.5161 (equivalently 1 − (prob(E1/Q) + prob(E2/Q))).
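The arithmetic above can be checked directly by recomputing the posteriors with Bayes' rule:

```python
# Recomputing the bulb example with Bayes' rule.
priors      = {'M1': 0.30, 'M2': 0.30, 'M3': 0.40}       # P(Ei)
likelihoods = {'M1': 0.02, 'M2': 0.03, 'M3': 0.04}       # P(Q|Ei)

p_defective = sum(priors[m] * likelihoods[m] for m in priors)   # P(Q) = 0.031

for m in priors:
    posterior = priors[m] * likelihoods[m] / p_defective        # P(Ei|Q)
    print(m, round(posterior, 4))
# M1 0.1935, M2 0.2903, M3 0.5161
```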
BAYESIAN NETWORKS: Network representations of knowledge have been used to graphically exhibit the interdependencies which exist between related pieces of knowledge. Much work has been done in this area to develop a formal syntax and semantics for such representations. Network representations for uncertain dependencies are further motivated by the observations made earlier. If we wish to represent uncertain knowledge related to a set of propositional variables x1, …, xn by their joint distribution P(x1, …, xn), it will require some 2^n entries to store the distribution explicitly. Furthermore, a determination of any one of the marginal probabilities xi requires summing P(x1, …, xn) over the remaining n−1 variables. Clearly the time and storage requirements for such computations quickly become impractical, and inferring with such large numbers of probabilities does not appear to model the human process either. On the contrary, humans tend to single out only a few propositions which are known to be causally linked when reasoning with uncertain beliefs. This metaphor leads quite naturally to a form of network representation.
To represent causal relationships between the propositional variables x1, …, x6, one can write the joint probability P(x1, …, x6) by inspection as a product of conditional probabilities:
P(x1, …, x6) = P(x6/x5) P(x5/x2,x3) P(x4/x1,x2) P(x3/x1) P(x2/x1) P(x1)
Diagram: (figure not reproduced) the corresponding network is a directed acyclic graph over x1, …, x6, with an arc from each conditioning variable to the variable it conditions (x1 to x2, x1 to x3, x1 and x2 to x4, x2 and x3 to x5, x5 to x6).
A Bayes network is a directed acyclic graph whose nodes are labeled by random variables. Bayes networks are sometimes called causal networks because the arcs connecting the nodes can be thought of as representing causal relationships.
To construct a Bayesian network for a given set of variables, we draw arcs from cause variables to their immediate effects. We preserve the formalism and rely on the modularity of the world we are trying to model.
Consider an example:
S: Sprinkler was on last night
W: Grass is wet
R: It rained last night
We can write MYCIN-style rules that describe predictive relationships among these three events:
IF the sprinkler was on last night, THEN there is evidence that the grass will be wet this morning.
Taken alone, this rule may accurately describe the
world. But consider a second rule:
IF: the grass is wet this morning then there is
evidence that it rained last night.
Taken alone, this rule makes sense when rain is the most common source of water on the grass.
There are two different ways that propositions can
influence the likelihood of each other. The first is that causes influence the
likelihood of their symptoms; the second is that observing a symptom affects the
likelihood of all of its possible causes. The basic idea behind the Bayesian
network structure is to make a clear distinction between these two kinds of
influence.
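A small sketch of this network (arcs S to W and R to W) is given below. All the probability values are made-up numbers chosen only for illustration; the point is that observing the symptom W raises the probability of its cause R.

```python
# Sketch: joint distribution and inference by enumeration for the S, R, W network.
import itertools

P_S = {True: 0.2, False: 0.8}                       # P(sprinkler was on)
P_R = {True: 0.3, False: 0.7}                       # P(it rained)
P_W = {                                             # P(grass wet | S, R)
    (True,  True):  0.99, (True,  False): 0.90,
    (False, True):  0.80, (False, False): 0.05,
}

def joint(s, r, w):
    pw = P_W[(s, r)] if w else 1 - P_W[(s, r)]
    return P_S[s] * P_R[r] * pw

# P(R | W=true): observing the symptom (wet grass) raises belief in its cause.
num   = sum(joint(s, True, True) for s in (True, False))
denom = sum(joint(s, r, True) for s, r in itertools.product((True, False), repeat=2))
print(round(num / denom, 3))    # about 0.62 with these illustrative numbers
```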
DEMPSTER-SHAFER THEORY: The Bayesian approach depends on the use of known prior and likelihood probabilities to compute conditional probabilities. The Dempster-Shafer approach, on the other hand, is a generalization of classical probability theory which permits the assignment of probability masses (beliefs) to all subsets of the universe and not just to the basic elements.
The generalized theory was proposed by Arthur Dempster (1968) and extended by his student Glenn Shafer (1976). It has come to be known as the Dempster-Shafer theory of evidence. The theory is based on the notion that separate probability masses may be assigned to all subsets of a universe of discourse, rather than just to the indivisible single members, as is required in traditional probability theory.
In the Dempster-Shafer theory we assume a universe of discourse U and a set of n propositions about U, exactly one of which is true. The propositions are assumed to be exhaustive and mutually exclusive. Let 2^U denote all subsets of U, including the empty set and U itself. A basic probability assignment is a function
m: 2^U -> [0, 1] with Σ m(A) = 1, the sum being taken over all A ⊆ U.
The function m defines a probability distribution on 2^U; m(A) represents the measure of belief committed exactly to A. A belief function Bel, corresponding to a specific m, is defined for a set A as the sum of the beliefs committed to every subset of A by m. Bel(A) is a measure of the total support or belief committed to the set A and sets a minimum value for its likelihood:
Bel(A) = Σ m(B), the sum being taken over all B ⊆ A.
In the Dempster-Shafer theory a belief interval can also be defined for a subset A. It is represented as the subinterval [Bel(A), Pl(A)] of [0, 1], where Bel(A) is also called the support of A and Pl(A) = 1 − Bel(~A) is the plausibility of A. When evidence is available from two or more independent knowledge sources Bel1 and Bel2, one would like to pool the evidence to reduce the uncertainty. For this, Dempster has provided a combining function, denoted Bel1 ⊕ Bel2. The total probability mass committed to C is
Σ m1(Ai) m2(Bj), the sum being taken over all Ai and Bj with Ai ∩ Bj = C.
The sum in the above equation must be normalized to account for the fact that some intersections Ai ∩ Bj = Ø will have positive probability mass which must be discarded. The final form of Dempster's rule of combination is then given by
(m1 ⊕ m2)(C) = [Σ over Ai ∩ Bj = C of m1(Ai) m2(Bj)] / [Σ over Ai ∩ Bj ≠ Ø of m1(Ai) m2(Bj)]
where the summations are taken over all i and j.
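A small sketch of Dempster's rule of combination is given below. The frame of discernment and the mass values are illustrative assumptions, not taken from the notes.

```python
# Sketch of Dempster's rule of combination over a small frame of discernment.
from itertools import product

U = frozenset({'flu', 'cold', 'allergy'})

# Two independent sources assigning mass to subsets of U.
m1 = {frozenset({'flu', 'cold'}): 0.6, U: 0.4}
m2 = {frozenset({'flu'}): 0.7,         U: 0.3}

def combine(m1, m2):
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        c = a & b
        if c:
            combined[c] = combined.get(c, 0.0) + x * y
        else:
            conflict += x * y            # mass on empty intersections is discarded
    return {s: v / (1 - conflict) for s, v in combined.items()}

def bel(m, a):
    """Bel(A) = sum of m(B) over all subsets B of A."""
    return sum(v for b, v in m.items() if b <= a)

m12 = combine(m1, m2)
print(m12)
print(bel(m12, frozenset({'flu'})))      # support committed to {flu}
```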
Reasoning using certainty factors: Probability-based reasoning adopts Bayes' theorem for handling uncertainty. Unfortunately, to apply Bayes' theorem one needs to estimate a priori and conditional probabilities, which are difficult to obtain in many domains. Hence, to circumvent this problem, the developers of the MYCIN system adopted certainty factors.
A certainty factor (CF) is a numerical estimate of the belief or disbelief in a conclusion in the presence of a set of evidence. Various methods of using CFs have been adopted; typical ones are as follows.
1. Use a scale from 0 to 1, where 0 represents total disbelief and 1 stands for total belief. Other values between 0 and 1 represent varying degrees of belief and disbelief.
2. MYCIN's CF representation is a scale from −1 to +1, where the value 0 stands for unknown.
In Expert systems, every production rule has a certainty factor associated
with it. The values of the CF are determined by the domain expert who
creates the knowledge base.
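The notes above do not give a formula for combining two CFs that bear on the same conclusion; the sketch below uses the commonly cited MYCIN/EMYCIN combination rule on the −1 to +1 scale, included here as an assumption for illustration.

```python
# Sketch: combining two certainty factors on the same conclusion
# (the commonly cited MYCIN/EMYCIN combination rule, on the -1..+1 scale).

def combine_cf(cf1: float, cf2: float) -> float:
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 <= 0 and cf2 <= 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Two rules each lend moderate belief to the same conclusion.
print(combine_cf(0.6, 0.4))    # 0.76  -- evidence accumulates
print(combine_cf(0.6, -0.4))   # 0.33  -- conflicting evidence partly cancels
```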
Matching
Matching is a basic function that is
required in almost all A.I programs. It is an essential part of more complex
operations such as search and control. In many programs it is known that
matching consumes a large fraction of the processing time.
Matching is the process of comparing two or
more structures to discover their likeness or differences. The structures may
represent a wide range of objects including physical entities, words or phrases
in some language, complete classes of things, general concepts, relations
between complex entities and the like. The representations will be given in one
or more of the formalisms like FOPL, networks or some other scheme and matching
will involve comparing the component parts of such structures.
Matching
is used in a variety of programs for different reasons. It may serve to control
the sequence of operations, to identify or classify objects, to determine the
best of a number of different alternatives or to retrieve items from a
database. It is an essential operation in such diverse programs as speech
recognition, natural language understanding, vision, learning, automated
reasoning, planning, automatic programming and expert system as well as many
others.
Matching is essentially the process of comparing two structures or patterns for equality. The match fails if the patterns differ in any aspect.
Different types of matching are:
(1) Indexing
(2) Matching with variables
(3) Complex and approximate matching
(4) Conflict resolution
(1) Indexing: In the case of indexing we use the current state as an index into the rules in order to select the matching ones. Consider chess. Here we assign a number to each board position; then we use a hashing function to treat that number as an index into the rules.
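A minimal sketch of this idea is shown below; the state descriptions and rules are invented for illustration.

```python
# Sketch: indexing rules by a hash of the current state so that only
# candidate rules are examined (illustrative, not a real chess program).

RULES = [
    ("white to move, king in check", "move the king out of check"),
    ("white to move, pawn on 7th rank", "promote the pawn"),
]

# Build the index once: hashed state description -> applicable rules.
index = {}
for condition, action in RULES:
    index.setdefault(hash(condition), []).append((condition, action))

current_state = "white to move, king in check"
candidates = index.get(hash(current_state), [])   # only matching rules are fetched
print(candidates)
```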
(2) Matching with variables: In this case we try to match many rules against many elements in the state description simultaneously. One efficient algorithm is RETE, which gains efficiency from three major sources:
(a) The temporal nature of the data. A rule firing does not usually change working memory radically; it typically adds or deletes only one or two elements.
(b) Structural similarity in rules. There may be situations where some rules differ in only one or two conditions, the other conditions being the same.
(c) Persistence of variable-binding consistency across conditions.
For example:
i. son(x, y) ^ son(y, z) -> grandparent(x, z)
ii. son(Mary, Joe)
iii. son(Bill, Bob)
Here facts (ii) and (iii) can each match the first condition of rule (i), but to fire the rule both conditions must be satisfied with a consistent binding for the variable y.
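Below is a small sketch of matching conditions containing variables against working-memory facts. Variables are written as '?x', and the extra fact son(Joe, Ann) is a hypothetical addition so that the two-condition join has a solution.

```python
# Sketch: matching rule conditions with variables against working-memory facts.
FACTS = [('son', 'Mary', 'Joe'), ('son', 'Bill', 'Bob'), ('son', 'Joe', 'Ann')]

def match(pattern, fact, bindings):
    """Return extended bindings if pattern matches fact, else None."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith('?'):                 # variable: bind or check consistency
            if bindings.get(p, f) != f:
                return None
            bindings[p] = f
        elif p != f:                          # constant: must match exactly
            return None
    return bindings

def match_rule(conditions):
    """Find all consistent bindings satisfying every condition of a rule."""
    solutions = [{}]
    for cond in conditions:
        solutions = [b2 for b in solutions for fact in FACTS
                     if (b2 := match(cond, fact, b)) is not None]
    return solutions

# son(?x, ?y) ^ son(?y, ?z) -> grandparent(?x, ?z): ?y must be bound consistently.
print(match_rule([('son', '?x', '?y'), ('son', '?y', '?z')]))
# [{'?x': 'Mary', '?y': 'Joe', '?z': 'Ann'}]
```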
(3) Complex and approximate matching: An approximate matching is one which is used when the preconditions only approximately match the current situation.
Consider an example of a dialogue between ELIZA and a user. Here ELIZA will try to match the left side of a rule against the user's last sentence and use the corresponding right side to generate a response. Let us consider the following ELIZA rules:
(X me Y) -> (X you Y)
(I remember X) -> (Why do you remember X just now?)
(My {family-member} is Y) -> (Who else in your family is Y?)
Suppose the user says, "I remember Mary". ELIZA will try to match this response against the left-hand sides of the given rules. It finds that it matches the second rule, so it takes the corresponding right-hand side and asks "Why do you remember Mary just now?"
This is how the conversation proceeds, taking approximate matching into consideration.
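A minimal sketch of this kind of rule matching using regular expressions is shown below; the implementation details are illustrative, not ELIZA's actual code.

```python
# Sketch of ELIZA-style matching using regular expressions.
import re

RULES = [
    (r'^(.*) me (.*)$',        r'\1 you \2'),
    (r'^I remember (.*)$',     r'Why do you remember \1 just now?'),
    (r'^My (mother|father|brother|sister) is (.*)$',
                               r'Who else in your family is \2?'),
]

def respond(sentence):
    for pattern, template in RULES:
        m = re.match(pattern, sentence, flags=re.IGNORECASE)
        if m:
            return m.expand(template)     # fill the right side with the match groups
    return "Please go on."

print(respond("I remember Mary"))   # Why do you remember Mary just now?
```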
(4)Conflict Resolution: Conflict Resolution is a strategy in which
we incorporate the decision making into the matching process. We have three basic approaches.
· Preference based on rules
· Preference based on objects
· Preference based on states
(a) Preference based on rules: Here we consider the rules in the order they are given, or we give some priority to special-case rules. There are two ways in which a matcher can decide that one rule is more general than another; this allows us to decrease the search by preferring the more specific rules. The two ways are:
1. If one rule contains all the preconditions that another rule has plus some additional ones, then the second rule is more general than the first.
2. If one rule contains preconditions with variables and the other contains the same preconditions with constants, then the first rule is more general.
(b) Preference based on objects: Here we use keywords to match into the rules. Consider the example of ELIZA: it takes specific keywords from the user's response and tries to match those keywords with the given rules. As before, it uses the keyword "remember" to match the left-hand side of a rule.
(c) Preference based on states: In this case we fire all the rules that are waiting; they lead us to new states. Using a heuristic function we can then decide which state is the best.
Partial matching: For many AI applications, complete matching between two or more structures is inappropriate. For example, input representations of speech waveforms or visual scenes may have been corrupted by noise or other unwanted distortions. In such cases we do not want to reject the input out of hand; our systems should be more tolerant of such commonly occurring problems. Instead, we want our systems to be able to find an acceptable or best match between the input and some reference description.
The RETE Matching algorithm:
A typical system will contain a Knowledge Base which
contains structures representing the domain expert’s knowledge in the form of
rules or productions, a working memory which holds parameters for the current
problem, and an inference engine with rule interpreter which determines which
rules are applicable for the current problem.
The basic inference cycle of a production system is
match, select, and execute as indicated in figure. These operations are performed as follows:
(Figure: the production system inference cycle, not reproduced. The knowledge base and working memory feed the match step, the matching rules form the conflict set, one rule is selected from the conflict set, and executing it updates working memory.)
Match: During the match portion of the cycle, the
conditions in the left hand side(LHS) of the rules in the knowledge base are
matched against the contents of working memory to determine which rules have
their LHS conditions satisfied with
consistent bindings to working memory terms.
Rules which are found to be applicable (that match) are put in a conflict set.
Select: From the conflict set, one of the rules is selected to execute. The selection strategy may depend on recency of usage, specificity of the rule, or other criteria.
Execute: The rule selected from the conflict set is
executed by carrying out the action or conclusion part of the rule, the right
hand side (RHS) of the rule. This may involve an I/O operation, adding,
removing or changing clauses in working Memory or simply causing a halt.
The above cycle is repeated until no rules are put in
the conflict set or until a stopping condition is reached.
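A minimal sketch of the match-select-execute cycle is given below. The rule format and the "most specific rule first" selection strategy are illustrative assumptions.

```python
# Sketch of the match-select-execute cycle of a production system.
RULES = [
    ('R1', {'bird'},            'has-wings'),
    ('R2', {'bird', 'penguin'}, 'swims'),
]

working_memory = {'bird', 'penguin'}

while True:
    # Match: find rules whose LHS conditions are all present in working memory.
    conflict_set = [(name, lhs, rhs) for name, lhs, rhs in RULES
                    if lhs <= working_memory and rhs not in working_memory]
    if not conflict_set:
        break                                   # stopping condition: nothing to fire
    # Select: prefer the most specific rule (largest number of conditions).
    name, lhs, rhs = max(conflict_set, key=lambda r: len(r[1]))
    # Execute: carry out the RHS by adding its conclusion to working memory.
    working_memory.add(rhs)
    print(f'fired {name}, working memory = {working_memory}')
```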
A typical knowledge base will contain hundreds or even thousands of rules, and each rule will contain several (perhaps as many as ten or more) conditions. Working memories typically contain hundreds of clauses as well. Consequently, exhaustive matching of all rules and their LHS conditions against working-memory clauses may require tens of thousands of comparisons. This accounts for the earlier claim that as much as 90% of the computing time for such systems can be related to matching operations.
To eliminate the need to perform thousands
of matches per cycle, an efficient match algorithm called RETE has been
developed (Forgy, 1982). It was
initially developed as part of the OPS family of programming languages
(Brownston, et al., 1985). This algorithm
uses several novel features, including methods to avoid repetitive matching on
successive cycles. The main timesaving
features of RETE are as follows.
Advantages:
a. In most expert systems, the contents of
working memory change very little from cycle to cycle.
b. Many rules in a knowledge base will have
the same conditions occurring in their LHS.
c. The temporal nature of data.
d. Structural similarity in rules.
e. Persistence of variable binding
consistency.