This post has been updated for the Selenium 4 changes to the WebDriver hierarchy.
Selenium WebDriver has many important classes and interfaces. In this post we will look at the hierarchy of classes and interfaces related to the WebDriver interface.
Understanding the “why” behind a line of code is very important; copying code from others without understanding it does more harm than good. I often see people downcast a WebDriver reference to TakesScreenshot to capture a screenshot, or to JavascriptExecutor to run JavaScript, without being able to explain the reason behind it. So in this post I will try to explain the WebDriver hierarchy so that you can easily understand when to up cast, when to downcast, and whether a cast is needed at all.
Let’s start with some basic Java concepts:
- We cannot create an object of an interface, i.e. we cannot instantiate an interface.
- An interface has abstract methods, which means they have only a declaration and no body. Note – Since JDK 1.8 an interface can also have default and static methods. The WebDriver interface has no static or default methods as of now.
- When we create an object of a class and assign it to a reference of its super class or interface, it is called up casting. The reverse is not allowed, i.e. a child type reference cannot hold a super type object.
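The three points above can be sketched in a few lines. This is only an illustration: the names `Shape`, `Circle`, `Square` and `tryDowncast` are made up for this example and are not part of Selenium.

```java
public class CastingBasics {

    // Hypothetical interface: it only declares a method, there is no body.
    interface Shape {
        String name();
    }

    static class Circle implements Shape {
        @Override
        public String name() { return "circle"; }

        // Extra method that the Shape interface does not declare.
        public String radiusInfo() { return "radius known"; }
    }

    static class Square implements Shape {
        @Override
        public String name() { return "square"; }
    }

    // Tries to treat any Shape as a Circle and reports what happened.
    static String tryDowncast(Shape shape) {
        try {
            Circle circle = (Circle) shape;   // compiles, but is checked at run time
            return "ok: " + circle.radiusInfo();
        } catch (ClassCastException e) {
            return "ClassCastException for " + shape.name();
        }
    }

    public static void main(String[] args) {
        // Shape s = new Shape();   // does not compile: an interface cannot be instantiated

        Shape upcast = new Circle();          // up casting: implicit and always safe
        System.out.println(upcast.name());
        // upcast.radiusInfo();     // does not compile: Shape does not declare it

        System.out.println(tryDowncast(upcast));        // the object really is a Circle
        System.out.println(tryDowncast(new Square()));  // the object is not a Circle
    }
}
```

The downcast succeeds only when the object behind the reference really is of the target type, which is why a downcast is normally applied to a reference that was up cast earlier.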
Below is a Java program which demonstrates what happens when up casting is done and why we need to downcast.
Java Program
package SpecialConcepts;

class SuperClass1 {

    public void super_print() {
        System.out.println("Super Print");
    }

    public void super_show() {
        System.out.println("Super Show");
    }
}

public class SubClass1 extends SuperClass1 {

    public void sub_print() {
        System.out.println("Sub Print");
    }

    public void sub_show() {
        System.out.println("Sub Show");
    }

    public static void main(String[] args) {
        /*
         * Child class object up cast to a super class reference. Through the super
         * class reference, the sub class object cannot use the methods of the sub
         * class, despite being an object of that class. Up casting restricts access
         * to, or visibility of, methods further down.
         */
        SuperClass1 superClass1 = new SubClass1();
        superClass1.super_print();
        superClass1.super_show();

        /*
         * To access sub class methods, the super class reference needs to be
         * downcast to a sub class reference. Note that we can downcast only a
         * reference that was up cast.
         */
        SubClass1 subClass1 = (SubClass1) superClass1;
        subClass1.sub_print();
        subClass1.sub_show();
    }
}
Output
Super Print
Super Show
Sub Print
Sub Show
There is one important point: if a super class method is overridden in the sub class, the up cast reference calls the overridden method without any downcast.
package SpecialConcepts;

class SuperClass2 {

    public void print() {
        System.out.println("Super Print");
    }
}

public class SubClass2 extends SuperClass2 {

    public void print() {
        System.out.println("Sub Print");
    }

    public static void main(String[] args) {
        SuperClass2 superClass1 = new SubClass2();
        superClass1.print();
    }
}
Output
Sub Print
WebDriver Hierarchy Diagram
Let’s understand the above diagram in detail:
Interface SearchContext
SearchContext is the topmost interface in the WebDriver hierarchy. It declares two methods, findElement(By by) and findElements(By by), and it is extended by both the WebDriver and WebElement interfaces.
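Because both WebDriver and WebElement extend SearchContext, a helper written against SearchContext accepts either of them. The sketch below models that idea with simplified stand-in types: they borrow the Selenium names only for readability, use plain strings where the real API uses `By` and `WebElement`, and `FakePageDriver`, `FakeMenuElement` and `countMatches` are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SearchContextSketch {

    // Simplified stand-in for Selenium's SearchContext (strings replace By and WebElement).
    interface SearchContext {
        String findElement(String locator);
        List<String> findElements(String locator);
    }

    // Both stand-ins extend SearchContext, as the real WebDriver and WebElement do.
    interface WebDriver extends SearchContext {
        String getTitle();
    }

    interface WebElement extends SearchContext {
        String getText();
    }

    // Fake driver: searches the "whole page", which here holds three links.
    static class FakePageDriver implements WebDriver {
        public String findElement(String locator) { return findElements(locator).get(0); }
        public List<String> findElements(String locator) {
            return Arrays.asList("header-link", "menu-link", "footer-link");
        }
        public String getTitle() { return "Fake page"; }
    }

    // Fake element: searches only inside itself, here a menu with one link.
    static class FakeMenuElement implements WebElement {
        public String findElement(String locator) { return findElements(locator).get(0); }
        public List<String> findElements(String locator) {
            return Collections.singletonList("menu-link");
        }
        public String getText() { return "Menu"; }
    }

    // Typed against SearchContext, so it accepts a driver or an element.
    static int countMatches(SearchContext context, String locator) {
        return context.findElements(locator).size();
    }

    public static void main(String[] args) {
        System.out.println(countMatches(new FakePageDriver(), "a"));   // search in the whole page
        System.out.println(countMatches(new FakeMenuElement(), "a"));  // search inside the menu only
    }
}
```

The same call gives a different result depending on the context it starts from, which is the practical difference between searching from the driver and searching from an element.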
Interface WebDriver
WebDriver is an interface which extends the SearchContext interface. As per the official documentation: WebDriver is a remote control interface that enables introspection and control of user agents (browsers). The methods in this interface fall into three categories:
- Control of the browser itself
- Selection of WebElement and WebElements
- Debugging aids
Currently, you will need to instantiate implementations of this interface directly. It is hoped that you write your tests against this interface so that you may “swap in” a more fully featured browser when there is a requirement for one (i.e. up cast to WebDriver). Most implementations of this interface follow the W3C WebDriver specification.
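The “swap in” idea can be sketched without a real browser. In the example below the `WebDriver` interface is a two-method stand-in, and `FakeChromeDriver`, `FakeFirefoxDriver`, `createDriver` and `openPage` are hypothetical names used only for illustration.

```java
public class SwapInSketch {

    // Stand-in for the WebDriver interface, reduced to two methods.
    interface WebDriver {
        void get(String url);
        String getTitle();
    }

    static class FakeChromeDriver implements WebDriver {
        private String url = "";
        public void get(String url) { this.url = url; }
        public String getTitle() { return url + " opened in fake Chrome"; }
    }

    static class FakeFirefoxDriver implements WebDriver {
        private String url = "";
        public void get(String url) { this.url = url; }
        public String getTitle() { return url + " opened in fake Firefox"; }
    }

    // Hypothetical factory: the only place that knows the concrete classes.
    static WebDriver createDriver(String browser) {
        switch (browser.toLowerCase()) {
            case "chrome":
                return new FakeChromeDriver();
            case "firefox":
                return new FakeFirefoxDriver();
            default:
                throw new IllegalArgumentException("Unknown browser: " + browser);
        }
    }

    // The "test" depends only on the interface, so any implementation can be swapped in.
    static String openPage(WebDriver driver, String url) {
        driver.get(url);
        return driver.getTitle();
    }

    public static void main(String[] args) {
        for (String browser : new String[] {"chrome", "firefox"}) {
            System.out.println(openPage(createDriver(browser), "https://example.com"));
        }
    }
}
```

Only `createDriver` changes when a new browser is added; `openPage` stays untouched because it never names a concrete class.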
The WebDriver interface has multiple inner interfaces which contain methods related to specific areas of browser control.
- ImeHandler – An interface for managing input methods.
- Navigation – An interface that provides a mechanism to access the browser history.
- Options – An interface for managing stuff you would do in a browser menu
- TargetLocator – Used to locate a given frame or window.
- Timeouts – An interface for managing timeout behavior for WebDriver instances.
- Window – An interface to manage browser window actions like maximize, minimize etc.
Class RemoteWebDriver
RemoteWebDriver is a fully implemented, i.e. non-abstract, class which implements the WebDriver interface. All implementations of WebDriver that communicate with the browser, or with a RemoteWebDriver server, shall use a common wire protocol. This wire protocol defines a RESTful web service using JSON over HTTP.
Note that RemoteWebDriver contains the implementation that calls the RESTful web services of Selenium WebDriver, which in turn hit the respective browser server. It has no mechanism to perform any browser-specific action.
It also implements other interfaces; JavascriptExecutor and TakesScreenshot are two of them. Implementing JavascriptExecutor means the driver can execute JavaScript commands, and implementing TakesScreenshot means the driver can capture a screenshot and store it in different ways.
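A miniature model of this part of the hierarchy may help. The types below are stand-ins that reuse the Selenium names for readability only; their method signatures are simplified (strings instead of `OutputType` and varargs), and `implementedBy` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;

public class CapabilitySketch {

    // Simplified stand-ins; the real Selenium interfaces declare more methods.
    interface WebDriver { String getTitle(); }
    interface TakesScreenshot { String getScreenshotAs(String outputType); }
    interface JavascriptExecutor { Object executeScript(String script); }

    // One class implements all three interfaces, as described above.
    static class RemoteWebDriver implements WebDriver, JavascriptExecutor, TakesScreenshot {
        public String getTitle() { return "title"; }
        public String getScreenshotAs(String outputType) { return "screenshot as " + outputType; }
        public Object executeScript(String script) { return "ran: " + script; }
    }

    // A browser class inherits every implemented interface from its parent.
    static class ChromeDriver extends RemoteWebDriver { }

    // Lists which of the three interfaces the given object implements.
    static String implementedBy(Object driver) {
        List<String> names = new ArrayList<>();
        if (driver instanceof WebDriver) names.add("WebDriver");
        if (driver instanceof JavascriptExecutor) names.add("JavascriptExecutor");
        if (driver instanceof TakesScreenshot) names.add("TakesScreenshot");
        return String.join(", ", names);
    }

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        System.out.println(implementedBy(driver));

        // driver.getScreenshotAs("FILE");   // does not compile: WebDriver does not declare it
        TakesScreenshot camera = (TakesScreenshot) driver;   // same object, different view
        System.out.println(camera.getScreenshotAs("FILE"));
    }
}
```

The object is the same throughout; the reference type only decides which of its methods the compiler lets us call.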
Browser-specific classes
Then we have browser-specific driver classes like ChromeDriver, EdgeDriver, FirefoxDriver etc. in the hierarchy. These classes provide control of a browser running on the local machine and are offered as a convenience for easily testing the browsers. The control server (which is started every time we launch a browser) that each instance communicates with will live and die with the instance.
So now we have a fair idea of the hierarchy and a basic understanding of each level in it. Let’s look at some important lines of code which we use all the time without necessarily knowing the reason behind them.
WebDriver driver = new ChromeDriver();
TakesScreenshot takesScreenshot = (TakesScreenshot) driver;
File screenshot = takesScreenshot.getScreenshotAs(OutputType.FILE);
Why did we downcast the WebDriver reference to TakesScreenshot?
Because the ChromeDriver object is held by a reference of type WebDriver, i.e. it is up cast. The WebDriver interface does not declare getScreenshotAs(); that method belongs to the TakesScreenshot interface, which is implemented further down the hierarchy by RemoteWebDriver and therefore by ChromeDriver. As we saw in the program above, an up cast reference has no visibility of methods that its own type does not declare. So to call getScreenshotAs() we must downcast the reference either to the TakesScreenshot interface or to RemoteWebDriver.
WebDriver driver = new ChromeDriver();
RemoteWebDriver remoteWebDriver = (RemoteWebDriver) driver;
File screenshot1 = remoteWebDriver.getScreenshotAs(OutputType.FILE);
The same concept applies when we need to execute a JavaScript command: we downcast to JavascriptExecutor or RemoteWebDriver.
WebDriver driver = new ChromeDriver();
JavascriptExecutor javascriptExecutor = (JavascriptExecutor) driver;
javascriptExecutor.executeScript("arguments[0].click()", driver.findElement(By.id("Some Id")));
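A downcast like the ones above fails at run time with a ClassCastException if the object behind the WebDriver reference does not implement the target interface. One way to guard against that is an instanceof check before the cast. The sketch below assumes simplified stand-in interfaces; `ScriptCapableDriver`, `PlainDriver` and `runScript` are hypothetical names.

```java
public class SafeCastSketch {

    // Simplified stand-ins for the Selenium interfaces of the same name.
    interface WebDriver { String getTitle(); }
    interface JavascriptExecutor { Object executeScript(String script); }

    // A driver that can run scripts.
    static class ScriptCapableDriver implements WebDriver, JavascriptExecutor {
        public String getTitle() { return "page"; }
        public Object executeScript(String script) { return "ran: " + script; }
    }

    // A driver that implements WebDriver only.
    static class PlainDriver implements WebDriver {
        public String getTitle() { return "page"; }
    }

    // Checks the capability first, so a missing interface gives a clear message
    // instead of a bare ClassCastException.
    static Object runScript(WebDriver driver, String script) {
        if (!(driver instanceof JavascriptExecutor)) {
            throw new UnsupportedOperationException(
                    driver.getClass().getSimpleName() + " cannot execute JavaScript");
        }
        return ((JavascriptExecutor) driver).executeScript(script);
    }

    public static void main(String[] args) {
        System.out.println(runScript(new ScriptCapableDriver(), "return 1"));
        try {
            runScript(new PlainDriver(), "return 1");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the standard browser classes the check always passes, since they inherit JavascriptExecutor from RemoteWebDriver; the guard matters mainly for wrappers or custom WebDriver implementations.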
Remember that if we do not up cast, we do not need to downcast.
ChromeDriver chromeDriver = new ChromeDriver();
File screenshot = chromeDriver.getScreenshotAs(OutputType.FILE);
chromeDriver.executeScript("arguments[0].click()", chromeDriver.findElement(By.id("Some Id")));
You can download/clone above sample project from here.
If you have any doubt, feel free to comment below.
Amod, This is my favourite article in your selenium blog.
wow you cleared my concept. God bless you Amod.
Amod..!! i have been into QA automation since very long but the way this blog has been articulated is very apt and in an understandable manner. Very information and impressive blog 🙂
Thanks Amod. It’s great depth knowledge for Basic flow..
Can u please explain findElement and By Class in the same manner and demystify it.. Thanks for the help
Thanks, it was quick and useful.
Thank you Thank you Thank you for such a lovely post. Maaza aa gaya 🙂 Keep it up bro!.
Can you please explain this
“Keeping 2nd point in mind, We can upcast till WebDriver as we are not making it difficult to use important methods because of major concept of overriding in java. So, we do not upcast to RemoteWebDriver.”
If we upcast to RemoteWebDriver its a fully implemented class which has access to all methods that are there even at WebDriver level. So why we did not upcasted only at RemoteWebDriver level.
Anyhow WebDriver is an interface so it does not have any method implemented. If we could have upcasted til RemoteWebDriver. Was it not enough ?
Yes, you can do that; there is no harm. But suppose a new class implementing WebDriver comes into the picture and it has no connection to RemoteWebDriver. If you have up cast to RemoteWebDriver, you cannot use the new class through that reference. If you have up cast to WebDriver, you can use the functionality of the new class as well. Let me know if you get the point here.
I’m late here, but I’ll throw an explanation I found online.
Firstly, unfortunately what he said seems wrong (I mean, the “why” is wrong). Method Override has nothing to do with using WebDriver in place of RemoteWebDriver (and there also is no override involved, either).
The whole explanation is this:
1) we make an upcast to WebDriver so we can achieve cross-browser compatibility.
In fact, if we instantiated a single browser object (which we can do, since those are classes too), we would then need to create a different object for any new browser we’d need to run our tests into.
ChromeDriver driver = new ChromeDriver();
InternetExplorerDriver driver2 = new InternetExplorerDriver();
etc…
Instead, using WebDriver we can achieve this purpose in a much simpler way:
WebDriver driver = new ChromeDriver();
driver.close();
driver = new FirefoxDriver();
This is an example of polymorphism. In fact, since WebDriver is an interface, it declares, in the form of signatures, all the methods which are then implemented in the respective classes, like ChromeDriver, FirefoxDriver, and so on. Thus, whichever of these classes we instantiate, the same WebDriver reference can hold the object and the right implementation is called.
2) why the upcast is to WebDriver and not RemoteWebDriver:
it seems that with WebDriver, we can simply run our tests on our Local Machine, while when we use RemoteWebDriver we run our tests on a Remote Machine, and we need to tell the new object created, where the Selenium Standalone Server is.
example of RemoteWebDriver use:
DesiredCapabilities capabilities = DesiredCapabilities.firefox();
capabilities.setCapability("marionette", true);
capabilities.setCapability("networkConnectionEnabled", true);
capabilities.setCapability("browserConnectionEnabled", true);
WebDriver driver = new RemoteWebDriver(new URL("http://localhost:4444"), capabilities);
I found these explanations online, but they make more sense to me than this one.
A nice effort though, and the pictures were very useful to me. I’m writing my thesis on Selenium WebDriver.
Thanks for the wonderful post Amod. It is very well written and thoroughly explained!!
Thank you very much for nice explanation for understanding in-depth.
Thanks Harish.
Very Useful explanation !!!
Keep posting …Thank you so much!!
Thanks Prateek.
Very well explained. Thanks a lot.
Thanks a lot for explaining very basic things.
Thanks Surabhi.
Hi Amod,
You are doing a fabulous job. I am following each and every post of yours since you explained very basic things that are not readily available anywhere else.
I wanted to understand the connection between WebDriver and WebElement interfaces as both contains findElement(s) methods which are implemented by their resp. classes, so why we need implementation twice and what is the difference. Please explain.
Big Thanks,
Sweta
Hi Sweta,
The findElement methods in both WebDriver and WebElement do the same thing but in different contexts. findElement of WebElement helps you to find an element with respect to the current element.
For ex: WebElement ele1= some locator
WebElement ele2= ele1.findElement(….)
Thanks
Amod
So much knowledge in your every post..I am following your every blog for past few weeks and learning new things each time
Extremely useful !! Very Thankful to you
Thanks Tirath.
Really an very useful information.it clears many doubts…please keep on sharing your knowledge which helps the people like us..
Thank Amod..
Sure Mallikarjun.
Could you please explain this point in little detail sir. I did not understand in above section:
RemoteWebDriver driver= new FirefoxDriver();
why this is not preferable: RemoteWebDriver driver= new FirefoxDriver();
Hi Pavan,
I will publish a post soon on this.
ok thank you.
Hi Pavan,
RemoteWebDriver is super class of any browser class. So you can up cast a browser driver class object to its super class RemoteWebDriver .
Well explained concept.Great Job.
It would be very helpful if you could add code snippet for the explanation for better/easy understanding.
Thanks Siva. Will take care of your feedback.
I have a question here: as all the methods are implemented in the RemoteWebDriver class, why can’t we call the methods directly from the RemoteWebDriver class itself? Why are we accessing them through WebDriver? I tried using the RemoteWebDriver class, but the error says the constructors are not visible, since it is protected. Could you please clarify that?
Thanks in advance
Hi Divya,
RemoteWebDriver works as a base class for all browser classes. It is mainly useful when you run your scripts in a grid. The default constructor of RemoteWebDriver is not visible outside its package because it is protected. You need to set up a browser in a grid and use a statement like the one below:
WebDriver driver = new RemoteWebDriver(new URL("http://localhost:4444/wd/hub"), DesiredCapabilities.firefox());
Thanks
Amod
thanx for such clear explanation
Thanks.
Hi Amod Sir,
I have a doubt that if we upcast only upto Remote WebDriver then there will be no need for downcasting again for Takescreenshot and JavaScript Executor as it is fully implemented class(methods from SearchContext and WebDriver interface)
It is also mentioned that we should up cast the object to the maximum level possible, keeping in mind that we should not lose important functionality. But if we up cast to WebDriver, we lose some functionality. Please help me understand if I am wrong here.
Hi Kavya,
There is java concept called Runtime polymorphism which is achieved when you upcast to WebDriver. RemoteWebDriver is a class.
I dont understand the above comment sir. Runtime polymorphism can be achieved using classes also. Not require an Interface always. please clarify me.
Upcasting browser driver class object to WebDriver is example of achieving run time polymorphism.
Really it’s an amazing explanation. It’s one of the best blogs I’ve ever seen.
Thanks Reddy.
Really appreciate for the information provided. Thanks a lot.
Thanks Vinit
Thanks..its good overview…
Thanks Mohan.
Really Helpful Contents. This post has Clarified many doubts .Thank you So much Amod.
Thanks Gaurav. 🙂
i have read so many post online but the way you explain and depth you goes in is awesome.. One day your blog will be one of the best because of the content you have.
Keep up the great work!!!!
Thanks for great compliments.
Truly said Gaurav. Thanks Amod sir for sharing the knowledge