Exploring the Microsoft IE OLE interface

In this article we use Allegro CL (ACL) to explore Microsoft Internet Explorer's OLE interface. We'll use two tools: the ole:def-ole-linkage system and the more flexible, but lower-level, remote-autotool system. To reproduce the steps here you'll need Allegro CL 8.0 running on Microsoft Windows with the latest Allegro updates downloaded and installed, Microsoft Internet Explorer, and an internet connection that can reach google.com.

We'll work in a non-IDE ACL to keep the displays simple. We'll use mlisp (the case-sensitive image) rather than alisp so that the mixed-case symbol names are visible. The same code will also work in alisp.

To get started we do this:

cl-user(1): (require :ole-dev)
; Fast loading c:\acl80\src\cl\src\ole\ole-dev.fasl
;   Fast loading c:\acl80\src\cl\src\ole\ole.fasl
;;; Installing ole patch, version 2
;     Fast loading c:\acl80\src\cl\src\code\ffcompat.fasl
;     Fast loading c:\acl80\src\cl\src\ole\client\imalloc.fasl
;       Fast loading c:\acl80\src\cl\src\ole\client\iunknown.fasl
;   Fast loading c:\acl80\src\cl\src\ole\olecomp.fasl
;;; Installing olecomp patch, version 2
t
cl-user(2): :ld sys:ole;client;autotool.fasl
; Fast loading c:\acl80\src\cl\src\ole\client\autotool.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\idispatch.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\iclassfactory.fasl
cl-user(3): 

The first form loads the main OLE development environment. The second loads autotool, the adaptive tool for OLE interfaces.

We plan to have the ole:def-ole-linkage subsystem build a Lisp interface to Internet Explorer. To do that we need some way of finding a type library that describes the interface. We'll use some of the OLE machinery to help locate one. We'll take a wild guess first because even if we guess wrong, it will make Lisp load in the support functions we'll be needing.

cl-user(3): (ole:def-ole-linkage #:msx :application "Internet Explorer")
; Fast loading c:\acl80\src\cl\src\ole\client\usetypelib.fasl
;;; Installing usetypelib patch, version 1
;   Fast loading c:\acl80\src\cl\src\ole\client\itypelib.fasl
;     Fast loading c:\acl80\src\cl\src\ole\client\itypeinfo.fasl
; Foreign loading ole32.dll.
; Foreign loading advapi32.dll.
; Foreign loading oleaut32.dll.
Error: Could not load typelib

Restart actions (select using :continue):
 0: Return to Top Level (an "abort" restart).
 1: Abort entirely from this (lisp) process.
[1] cl-user(4): 

That form asked Lisp to search the registry for an application that has "Internet Explorer" in its name and to build the corresponding Lisp classes and methods in the "msx" package. As you can see above, Lisp complains that it cannot load a type library. We're pretty sure the program is installed, so we must have the name wrong. We'll recover from this error and use an exploratory function to investigate:

[1] cl-user(4): :pop
cl-user(5): (ole:application-classid "Explorer")
Error: ambiguous matches:
       (InternetExplorer.Application InternetExplorer.Application.1
        Shell.Explorer Shell.Explorer.1 Shell.Explorer.2)

Restart actions (select using :continue):
 0: Return to Top Level (an "abort" restart).
 1: Abort entirely from this (lisp) process.
[1] cl-user(6):
ole:application-classid searches for "Explorer" in the registry and does one of three things. If there are no applications with "Explorer" in the name, it returns nil. If there is just one, it returns the GUID of that application. If there is more than one, it signals an error and lists the names it found. In our case, we see several names; some start with "InternetExplorer" and the others with "Shell.Explorer".
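
The three outcomes described above can be folded into a small helper that tries several candidate names in one pass. This is only a sketch: probe-names is a hypothetical name of our own, not part of the OLE module, and it takes the lookup function as an argument so that the helper itself needs nothing beyond standard Common Lisp.

```lisp
;; Hypothetical helper, not part of ACL's OLE module.
;; LOOKUP is any one-argument function; in the session above it would be
;; #'ole:application-classid.
(defun probe-names (lookup names)
  "Return an alist of (name . outcome) for each string in NAMES.
The outcome is whatever LOOKUP returns (its first value), or the
keyword :error when LOOKUP signals an error."
  (loop for name in names
        collect (cons name
                      (handler-case (funcall lookup name)
                        (error () :error)))))

;; In the session above one might try:
;; (probe-names #'ole:application-classid
;;              '("Internet Explorer" "Explorer" "InternetExplorer"))
```

Note that :error lumps the "ambiguous matches" case together with any other error the lookup might signal, so it is a hint to investigate by hand rather than a diagnosis.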

We pop out of that error and try a variant to see if "InternetExplorer.Application" and "InternetExplorer.Application.1" name the same application:

[1] cl-user(6): :pop
cl-user(7): (ole:application-classid "InternetExplorer")
#.(ole:unique-guid "{0002df01-0000-0000-c000-000000000046}")
1
0
cl-user(8):

Here the function has returned a GUID object, so we know we have an unambiguous designator. We can now pull in the typelib and build the corresponding Lisp functions, which will load a lot of support files and define a lot of classes, functions and methods:

cl-user(8): (ole:def-ole-linkage #:msx :application "InternetExplorer")
; Fast loading c:\acl80\src\cl\src\ole\client\container.fasl
;   Fast loading c:\acl80\src\cl\src\winapi\winapi.fasl
;   Fast loading c:\acl80\src\cl\src\ole\olewin.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\iviewobject2.fasl
;     Fast loading c:\acl80\src\cl\src\ole\client\iviewobject.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ioleobject.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ioleinplaceobject.fasl
;     Fast loading c:\acl80\src\cl\src\ole\client\iolewindow.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ioleclientsite.fasl
;   Fast loading
;      c:\acl80\src\cl\src\ole\client\ioleinplaceactiveobject.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ioleinplaceuiwindow.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ioleinplaceframe.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ipersiststorage.fasl
;     Fast loading c:\acl80\src\cl\src\ole\client\ipersist.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\irunnableobject.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\istorage.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\istream.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\ioleclientsite.fasl
;     Fast loading c:\acl80\src\cl\src\ole\server\iunknown.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\ioleinplacesite.fasl
;     Fast loading c:\acl80\src\cl\src\ole\server\iolewindow.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\iadvisesink.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\iolecontrolsite.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\ipropertynotifysink.fasl
;   Fast loading c:\acl80\src\cl\src\ole\client\ienumoleverb.fasl
;     Fast loading c:\acl80\src\cl\src\ole\client\ienumxxx.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\ioleinplaceframe.fasl
;     Fast loading
;        c:\acl80\src\cl\src\ole\server\ioleinplaceuiwindow.fasl
;   Fast loading c:\acl80\src\cl\src\ole\server\idispatch.fasl
; Fast loading c:\acl80\src\cl\src\ole\client\control.fasl
;;; Installing control patch, version 1
;   Fast loading c:\acl80\src\cl\src\ole\client\iconnectionpoint.fasl
;   Fast loading
;      c:\acl80\src\cl\src\ole\client\iconnectionpointcontainer.fasl
; Fast loading c:\acl80\src\cl\src\ole\defifc\idispatch.fasl
;   Fast loading c:\acl80\src\cl\src\ole\defifc\iunknown.fasl
"SHDocVw"
cl-user(9):

To see just what was done, you can do this:

(dribble "msxdefs.cl")
(pprint (macroexpand
 '(ole:def-ole-linkage #:msx :application "InternetExplorer")))
(dribble)

We capture the output in a dribble file because it's so large that most of it would be lost off the top of the console window's scroll buffer.

Rather than pore through the mass of text, we'll look in the msx package to see what classes it defines:

cl-user(9): (do-external-symbols (s :msx) (when (find-class s nil) (print s)))

msx:ShellWindows 
msx:InternetExplorer 
msx:ShellNameSpace 
msx:WebBrowser_V1 
msx:CScriptErrorList 
msx:SearchAssistantOC 
msx:WebBrowser 
msx:ShellUIHelper 
msx:ShellShellNameSpace 
msx:ShellBrowserWindow 
msx:ShellSearchAssistantOC 
nil
cl-user(10): 

Of these, InternetExplorer, WebBrowser_V1, and WebBrowser seem like potential controllers. We'll try the first and see what happens:

cl-user(10): (setq browser (make-instance 'msx:InternetExplorer))
#<msx:InternetExplorer @ #x2077cbda>
cl-user(11): (ole:connect-to-server browser :inplace nil)
t
cl-user(12): 

The second form executed without error, so we should have a browser running. It doesn't appear on the screen, but controls often have a visible attribute. We make a quick plausibility check:

cl-user(12): (apropos :visible :msx)
msx:OLECMDF_INVISIBLE value: 16
msx:Visible         [generic-function] (ole::this.object)
msx:-OnVisible      [generic-function] (ole::this.control
                                        ole::this.channel
                                        ole.control::Visible)
cl-user(13): 

This is promising. Let's try some other things:

cl-user(13): (msx:Visible browser)
nil
cl-user(14): (setf (msx:Visible browser) t)
t
cl-user(15): 
Voila! We see a web browser window appear. (It may be obscured by the Lisp console window, so look around for it.)

Now we'd like to manipulate this browser. We take a look at the functions that were defined by the ole:def-ole-linkage form.

First we just collect them:

cl-user(15): (loop for s being each external-symbol in :msx
                   when (fboundp s) collect s)
(msx:-PrintTemplateInstantiation msx:Top msx:-NewWindow3 msx:Offline
  msx:DeleteSubscriptionForSelection msx:Left
  msx:decode-ShellWindowTypeConstants msx:EnumOptions
  msx:-BeforeNavigate msx:ImportExportFavorites ...)
cl-user(16): 

but this produces a long list. It will be easier to scan them if they are sorted alphabetically:

(sort
  *
  #'(lambda (s1 s2) (string-lessp (symbol-name s1) (symbol-name s2))))

And easier still if they get printed one to a line:

(dolist (s *) (print s))

This produces output in the Lisp console window, and one interesting section is:

...
msx:MoveSelectionUp 
msx:Name 
msx:Navigate 
msx:Navigate2 
msx:NavigateAndFind 
msx:NavigateToDefaultSearch 
msx:NETDetectNextNavigate 
...
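
The collect, sort and print steps above can be packaged so they are easy to repeat for any package. The following is a minimal sketch using only standard Common Lisp; the names external-functions and print-external-functions are our own inventions for illustration.

```lisp
;; Hypothetical helpers; only standard Common Lisp is used.
(defun external-functions (package)
  "Return the fbound external symbols of PACKAGE, sorted by name
without regard to case."
  (let ((symbols '()))
    (do-external-symbols (s package)
      (when (fboundp s)
        (push s symbols)))
    (sort symbols #'string-lessp :key #'symbol-name)))

(defun print-external-functions (package &optional (stream *standard-output*))
  "Print the fbound external symbols of PACKAGE to STREAM, one per line."
  (dolist (s (external-functions package))
    (print s stream))
  (values))

;; In the session above one might try:
;; (print-external-functions :msx)
```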

Navigate sounds useful. Let's take a closer look like this:

cl-user(18): (describe 'msx:Navigate)
msx:Navigate is a tenured symbol.
  It is unbound.
  It is external in the msx package.
  Its function binding is #<standard-generic-function msx:Navigate>
    which function takes arguments
    (this.object URL &key (Flags missing) (TargetFrameName missing)
     (PostData missing) (Headers missing))
  Its property list has these indicator/value pairs:
#<msx:control-set-description>  t
cl-user(19): 

That looks like it might be just what we want. Let's give it a try with

(msx:Navigate browser "www.google.com")
This returns nil very quickly, and within a few seconds, depending on your system and internet connection, the browser window will be displaying the Google search page.

We can do more, though. Try entering a search string into the browser, using it independently from the Lisp that started it. Type in the word 'lisp' and hit enter. If we do that, we can see this in the browser's address window:

http://www.google.com/search?hl=en&q=lisp
We could build on the msx:Navigate function to do programmatic searches, like this:
(defun gsearch (item)
  (msx:Navigate browser
     (format nil "http://www.google.com/search?hl=eng&q=~a" item)))

Now try

(gsearch "fortran")
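
One caveat with gsearch is that the item is pasted into the URL unchanged, so a search string containing spaces or characters such as & or = would produce a malformed query. The sketch below percent-encodes the item first. The names url-encode and google-search-url are hypothetical, the hl=en parameter is copied from the address we observed in the browser, and the code assumes that char-code returns a Unicode code point.

```lisp
;; Hypothetical helpers; only standard Common Lisp is used.
(defun utf8-octets (code)
  "Return the list of UTF-8 octets for the code point CODE."
  (cond ((< code #x80) (list code))
        ((< code #x800)
         (list (logior #xC0 (ash code -6))
               (logior #x80 (logand code #x3F))))
        ((< code #x10000)
         (list (logior #xE0 (ash code -12))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))
        (t
         (list (logior #xF0 (ash code -18))
               (logior #x80 (logand (ash code -12) #x3F))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))))

(defun unreserved-char-p (ch)
  "True for ASCII letters, digits and the characters - . _ ~"
  (and (< (char-code ch) 128)
       (or (alphanumericp ch) (find ch "-._~"))))

(defun url-encode (string)
  "Return STRING with every character outside the unreserved set
replaced by %XX escapes of its UTF-8 octets."
  (with-output-to-string (out)
    (loop for ch across string
          do (if (unreserved-char-p ch)
                 (write-char ch out)
                 (dolist (octet (utf8-octets (char-code ch)))
                   (write-char #\% out)
                   (write-char (char "0123456789ABCDEF" (ash octet -4)) out)
                   (write-char (char "0123456789ABCDEF" (logand octet 15)) out))))))

(defun google-search-url (item)
  "Build a search URL for ITEM, following the address seen in the browser."
  (format nil "http://www.google.com/search?hl=en&q=~a" (url-encode item)))

;; In the session above one might try:
;; (msx:Navigate browser (google-search-url "common lisp"))
```

With this, (url-encode "common lisp") yields "common%20lisp", so a multi-word search reaches the server as a single query term.
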
All right, that's neat, but starting a search from Lisp would be a lot more interesting if we could read the results in Lisp. How can we get at the contents of that web page? Scanning that list of functions we made earlier, we don't see anything with a name like 'Contents', but we do see something in this section:
msx:DeleteSubscriptionForSelection 
msx:Depth 
msx:Document 
msx:EncodeString 
msx:EnumOptions 
msx:ExecWB 
The Document function might return us the 'document' displayed in the browser. We give it a try and see this:
cl-user(22): (msx:Document browser)
#<IDispatch-client:6fe6c>
cl-user(23): 

Whatever this document is, it supports an IDispatch interface. We'll save the interface in a variable (i.doc) and see what we can learn about it.

cl-user(23): (setq i.doc *)
#<IDispatch-client:6fe6c>
cl-user(24): (ole:get-type-info-count i.doc)
1
cl-user(25): (ole:get-type-info i.doc 0)
#<ITypeInfo-client:69794>
cl-user(26): 

get-type-info-count says there is one typeinfo item available, so we use get-type-info to grab it. Since it's an OLE interface, we save it in the variable i.ti.doc (mnemonically 'interface of typeinfo of document'). Then, we can get Lisp to read in the information from that typeinfo object. It will be a structure and we'll save it in the variable ta.doc:

cl-user(26): (setq i.ti.doc *)
#<ITypeInfo-client:69794>
cl-user(27): (setq ta.doc (ole:get-type-attr i.ti.doc))
#<dispatch lisp-typeinfo DispHTMLDocument @ #x209b8702>
cl-user(28):

The elements of this structure record what Lisp knows about the interface. We can see it's a DispHTMLDocument OLE interface, but there's more information buried inside the structure. There's no Lisp function to display that neatly, but we won't let that stop us. We'll use the Lisp inspector to poke around.

cl-user(28): (inspect ta.doc)
A new ole:lisp-typeinfo struct @ #x209b8702 = <...>
   0 Class --------> #<structure-class ole:lisp-typeinfo>
   1 name ---------> A simple-string (16) "DispHTMLDocument"
   2 interface-flags -> The symbol nil
   3 guid ---------> ole:lisp-guid struct = <...>
   4 lcid ---------> fixnum 0 [#x00000000]
   5 helpstring ---> The symbol nil
   6 helpcontext --> fixnum 0 [#x00000000]
   7 helpfile -----> The symbol nil
   8 memidConstructor -> A bignum = 4294967295
   9 memidDestructor -> A bignum = 4294967295
  10 cbSizeInstance -> fixnum 4 [#x00000010]
  11 typekind -----> fixnum 4 [#x00000010]
  12 kind ---------> The symbol :dispatch
  13 cFuncs -------> fixnum 203 [#x0000032c]
  14 functions ----> (# # # # # # # # # # ...), a proper list with 203 elements
  15 cVars --------> fixnum 0 [#x00000000]
  16 variables ----> The symbol nil
  17 cImplTypes ---> fixnum 1 [#x00000004]
  18 interfaces ---> ("IDispatch"), a proper list with 1 element
  19 cbSizeVft ----> fixnum 28 [#x00000070]
  20 cbAlignment --> fixnum 4 [#x00000010]
  21 wTypeFlags ---> fixnum 4112 [#x00004040]
  22 wMajorVerNum -> fixnum 0 [#x00000000]
  23 wMinorVerNum -> fixnum 0 [#x00000000]
  24 tdescAlias ---> The symbol nil
   ...
  30 base-ct ------> fixnum 0 [#x00000000]
[1i] cl-user(29): 

We see a member named 'functions' with 203 elements. It looks like this interface supports a lot of functionality. Inspecting this slot, we see it's a list of lisp-funcdesc structures.

[1i] cl-user(29): :i functions
A new proper list @ #x20b76071 with 203 elements
   0-> ole:lisp-funcdesc struct = <...>
   1-> ole:lisp-funcdesc struct = <...>
   2-> ole:lisp-funcdesc struct = <...>
   3-> ole:lisp-funcdesc struct = <...>
   4-> ole:lisp-funcdesc struct = <...>
   5-> ole:lisp-funcdesc struct = <...>
   6-> ole:lisp-funcdesc struct = <...>
   7-> ole:lisp-funcdesc struct = <...>
   8-> ole:lisp-funcdesc struct = <...>
   9-> ole:lisp-funcdesc struct = <...>
  10-> ole:lisp-funcdesc struct = <...>
  11-> ole:lisp-funcdesc struct = <...>
  12-> ole:lisp-funcdesc struct = <...>
  13-> ole:lisp-funcdesc struct = <...>
  14-> ole:lisp-funcdesc struct = <...>
  15-> ole:lisp-funcdesc struct = <...>
  16-> ole:lisp-funcdesc struct = <...>
  17-> ole:lisp-funcdesc struct = <...>
  18-> ole:lisp-funcdesc struct = <...>
  19-> ole:lisp-funcdesc struct = <...>
  20-> ole:lisp-funcdesc struct = <...>
  21-> ole:lisp-funcdesc struct = <...>
  22-> ole:lisp-funcdesc struct = <...>
  23-> ole:lisp-funcdesc struct = <...>
  24-> ole:lisp-funcdesc struct = <...>
   ...
 202-> ole:lisp-funcdesc struct = <...>
[1i] cl-user(30): 

The inspector always saves the last thing displayed as the value of *, so we can easily get a list of the function names like this:

[1i] cl-user(30): (mapcar 'ole:lisp-funcdesc-name *)
("Script" "all" "body" "activeElement" "images" "applets" "links"
 "forms" "anchors" "title" ...)
[1i] cl-user(31): (sort * 'string-lessp)
("activeElement" "alinkColor" "alinkColor" "all" "anchors"
 "appendChild" "applets" "attachEvent" "attributes" "baseUrl" ...)
[1i] cl-user(32): 

Seeing that it was a list of strings, we sorted it.

And then displayed it one per line:

[1i] cl-user(32): (dolist (s *) (print s))
...
"baseUrl" 
"bgColor" 
"bgColor" 
"body" 
"charset" 
"charset" 
"childNodes" 
...

Scanning the resulting list, we don't see anything that's obviously the contents of the document, but we do see a "body" function.
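
The sorted list contains repeated names such as "bgColor" and "charset", most likely because a property's getter and setter are described separately. When scanning a list of this size it can help to drop the repeats and to filter by a fragment of a name. Here is a minimal sketch in standard Common Lisp; unique-sorted-names and names-containing are hypothetical names of our own.

```lisp
;; Hypothetical helpers; only standard Common Lisp is used.
(defun unique-sorted-names (names)
  "Return a fresh list of the strings in NAMES, sorted without regard
to case and with exact duplicates removed."
  (remove-duplicates (sort (copy-list names) #'string-lessp)
                     :test #'string=))

(defun names-containing (fragment names)
  "Return the strings in NAMES that contain FRAGMENT, ignoring case."
  (remove-if-not (lambda (name)
                   (search fragment name :test #'char-equal))
                 names))

;; In the session above one might try:
;; (names-containing "body"
;;   (unique-sorted-names
;;     (mapcar 'ole:lisp-funcdesc-name
;;             (ole:lisp-typeinfo-functions ta.doc))))
```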

Relying on Lisp's ability to recover from errors, we'll just try calling the "body" method. After all, there's no reason to expect more than one body, so there's no need for any arguments besides the document itself. We do have a problem here, though. The value of i.doc is just an IDispatch interface, and making Dispatch method calls is a complicated business with a lot of details. The ole:def-ole-linkage system handles that for us, hiding the grubby details inside its generated code, but it didn't provide definitions for this kind of interface. Fortunately, ACL's autotool subsystem can also manage the messiness, and it does not require a type library first, although it is a little more primitive than the neat functions ole:def-ole-linkage gives us. First we exit the inspector, then we ask for an autotool wrapped around our document's IDispatch interface.

[1i] cl-user(33): :i q
#<dispatch lisp-typeinfo DispHTMLDocument @ #x209b8702>
cl-user(34): (setq tool.doc
  (make-instance 'ole:remote-autotool :dispatch i.doc))
#<ole:remote-autotool @ #x20a791ea>
cl-user(35):
This will let us ask for the body attribute:
cl-user(37): (ole:auto-getf tool.doc :body)
#<IDispatch-client:69e44>
cl-user(38):

We get yet another IDispatch interface, which is no real surprise. An html page has a lot of structure, and the interfaces reflect this. Repeating our probe, this time on the new interface:

cl-user(38): (setq i.body *)
#<IDispatch-client:69e44>
cl-user(39): (ole:get-type-info-count i.body)
1
cl-user(40): (setq i.ti.body (ole:get-type-info i.body 0))
#<ITypeInfo-client:6a0f4>
cl-user(41): (setq ta.body (ole:get-type-attr i.ti.body))
#<dispatch lisp-typeinfo DispHTMLBody @ #x20aade92>
cl-user(42): 
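
Since we have now run the same probe twice, it may be worth wrapping it in a function. The sketch below simply strings together the calls already used in this session (ole:get-type-info-count, ole:get-type-info, ole:get-type-attr, ole:lisp-typeinfo-functions and ole:lisp-funcdesc-name). The name dispatch-function-names is our own, the code runs only in a Lisp where the OLE module has been loaded as above, and it makes no attempt to release the ITypeInfo interface it obtains.

```lisp
;; Hypothetical helper built from the calls shown in the transcript.
;; Requires the OLE module, e.g. after (require :ole-dev).
(defun dispatch-function-names (dispatch)
  "Return the sorted function names described by the first typeinfo of
the IDispatch interface DISPATCH, or nil if it offers no type information."
  (when (plusp (ole:get-type-info-count dispatch))
    (let* ((typeinfo (ole:get-type-info dispatch 0))
           (attr (ole:get-type-attr typeinfo)))
      ;; mapcar returns a fresh list, so the destructive sort is safe.
      (sort (mapcar 'ole:lisp-funcdesc-name
                    (ole:lisp-typeinfo-functions attr))
            'string-lessp))))

;; In the session above one might try:
;; (dispatch-function-names i.body)
```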

Inspecting this structure, a description of a DispHTMLBody interface, we find 301 functions.

cl-user(42): (inspect ta.body)
A new ole:lisp-typeinfo struct @ #x20aade92 = <...>
   0 Class --------> #<structure-class ole:lisp-typeinfo>
   1 name ---------> A simple-string (12) "DispHTMLBody"
   2 interface-flags -> The symbol nil
   3 guid ---------> ole:lisp-guid struct = <...>
   4 lcid ---------> fixnum 0 [#x00000000]
   5 helpstring ---> The symbol nil
   6 helpcontext --> fixnum 0 [#x00000000]
   7 helpfile -----> The symbol nil
   8 memidConstructor -> A bignum = 4294967295
   9 memidDestructor -> A bignum = 4294967295
  10 cbSizeInstance -> fixnum 4 [#x00000010]
  11 typekind -----> fixnum 4 [#x00000010]
  12 kind ---------> The symbol :dispatch
  13 cFuncs -------> fixnum 301 [#x000004b4]
  14 functions ----> (# # # # # # # # # # ...), a proper list with 301 elements
  15 cVars --------> fixnum 0 [#x00000000]
  16 variables ----> The symbol nil
  17 cImplTypes ---> fixnum 1 [#x00000004]
  18 interfaces ---> ("IDispatch"), a proper list with 1 element
  19 cbSizeVft ----> fixnum 28 [#x00000070]
  20 cbAlignment --> fixnum 4 [#x00000010]
  21 wTypeFlags ---> fixnum 4112 [#x00004040]
  22 wMajorVerNum -> fixnum 0 [#x00000000]
  23 wMinorVerNum -> fixnum 0 [#x00000000]
  24 tdescAlias ---> The symbol nil
   ...
  30 base-ct ------> fixnum 0 [#x00000000]
[1i] cl-user(43): 
Extracting, sorting and printing the function names, as before,
:i q
(mapcar 'ole:lisp-funcdesc-name (ole:lisp-typeinfo-functions ta.body))
(sort * 'string-lessp)
(dolist (s *) (print s))
we can scan through the list of items and find a likely-looking entry: "innerHTML". We can wrap an autotool around the i.body interface and give it a try:
cl-user(47): (setq tool.body
  (make-instance 'ole:remote-autotool :dispatch i.body))
#<ole:remote-autotool @ #x2083a472>
cl-user(48): (ole:auto-getf tool.body :innerHTML)
"<TABLE cellSpacing=0 cellPadding=0 width=\"100%\" border=0>
<TBODY>
<TR>
<TD noWrap align=right>
<FONT size=-1>
<A href=\"https://www.google.com/accounts/Login?continue=
http://www.google.com/search%3Fhl%3Deng%26q%3Dfortran&hl=en\">
Sign in</A></FONT></TD></TR>
<TR height=4>
...
<TBODY>
<TR>
<TD align=middle>
<FONT size=-1>
<A href=\"/\">Google Home</A> - 
<A href=\"/intl/en/ads/\">Advertising Programs</A> - 
<A href=\"/services/\">Business Solutions</A> - 
<A href=\"/intl/en/about.html\">About Google </A>
</FONT> </TD> </TR> </TBODY> </TABLE>
<BR>
<FONT class=p size=-1>©2006 Google </FONT> </CENTER>"
cl-user(49): 
The result is a large text string, the HTML of the page our browser is displaying.
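
Once the page is an ordinary Lisp string, ordinary string functions apply to it. As a crude illustration, the sketch below collects the values of the double-quoted href attributes, which is enough to list the links on the results page. The name extract-hrefs is hypothetical, and the function is a plain text scan, not an HTML parser, so it will miss unquoted or single-quoted attributes.

```lisp
;; Hypothetical helper; only standard Common Lisp is used.
(defun extract-hrefs (html)
  "Collect the values of double-quoted href attributes in HTML, in order."
  (let ((marker "href=\"")
        (start 0)
        (result '()))
    (loop
      (let ((pos (search marker html :start2 start :test #'char-equal)))
        ;; No further attribute: return what we have collected.
        (unless pos (return (nreverse result)))
        (let* ((value-start (+ pos (length marker)))
               (value-end (position #\" html :start value-start)))
          ;; Unterminated attribute: stop rather than guess.
          (unless value-end (return (nreverse result)))
          (push (subseq html value-start value-end) result)
          (setq start (1+ value-end)))))))

;; In the session above one might try:
;; (extract-hrefs (ole:auto-getf tool.body :innerHTML))
```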

This exercise illustrated how ACL's OLE tools can be used to investigate a control available on the system and to learn enough to design Lisp functions that make use of it. We've discovered how to get our web browser to make a Google search request and how to get the resulting page data as a Lisp character string that we can process.

Further investigation would reveal other auto-getf calls that could be used to decompose the structure of that page according to its HTML entities, so that the lisp program could process the contents in a logical order without having to decode HTML syntax.

Copyright © 2014 Franz Inc., All Rights Reserved