Scientific Method for Wikimedians/The state-of-the-art

This is Chapter 7 of the course Scientific Method for Wikimedians. It belongs to Part Two of the course, "The research plan", which includes three chapters: the previous chapter, "Designing research question"; this chapter, "The state-of-the-art"; and one more chapter still to come.

In the previous chapter, Chapter 6 "Designing research question", we covered the first step of the scientific method. We agreed that the research question is not a question in the common sense of the word. Rather, it is a document of one to two pages that briefly describes the problem for non-experts. It includes three main parts: the title, the objective and motivations, and the problem definition. We described each of them in detail there.

In this chapter, we will cover the second step of the scientific method: the state-of-the-art. The chapter discusses seven major points:
 * 1) First, we will have a short reminder of the scientific method, especially Step 2, as well as the reference card system we described in chapter 5.
 * 2) After that, we will discuss what you need to include in the state-of-the-art, and how you can identify it.
 * 3) Then, we will have a short section on the objectives of the state-of-the-art: why we need it, and how the researcher can make the best of it.
 * 4) Next, we will cover the types of the state-of-the-art. There are two major types. One is used to establish a study, and here the difference between the researcher's work and the work of others is clear: this step only studies others' work, while the original work of the researcher comes next. However, in some cases the state-of-the-art is the objective itself, and this type of scientific work is called a survey. The goal of a survey is just to collect and present others' work in a specific domain of human knowledge. It is not an easy task, and the difficulty is that it is hard to tell what is the original work of the researcher and what is the work done by others. These two types are described below.
 * 5) After that, we will discuss a rare but possible case: when there is no previous work on a given subject. We will explain what to do in this case, and how.
 * 6) Sixth, I will present some errors that many researchers make when they conduct their state-of-the-art, and that you need to avoid in every possible way.
 * 7) Finally, we will explain the right way to do the state-of-the-art.

Reminder: Scientific method & card system
As we said before, the starting point is the scientific method itself. The state-of-the-art is its second step. Normally, we arrive at this step after the research question has been identified.

In this step, you need to look further to deepen your knowledge about the problem, and it is completely normal to change the research question slightly by the end of this step. What does the term state-of-the-art mean?

It means that you need to establish our current understanding of the research question you have identified. You also need to see who is currently working on the problem, if anyone, and what they are doing, simply to avoid repeating work that has already been done. We also agreed that the research question is completely based on others' work. You need to search for these works, check their reliability, and include them correctly in your state-of-the-art.

You need to understand that the state-of-the-art is not a random collection of articles and studies that cover your research question. You need to organize the collected data, analyze it, and then present the result as a summary of what you have found. Normally, the state-of-the-art forms the first chapter of a thesis or research report, if you are writing one, or the introduction and problem definition sections of a scientific journal article. As you know, these are the first two sections of the body of the article, and we covered them in chapter 5, so please refer to it if you need more information on this subject.

The result of this step is a reformed or enriched research question. By the end of this step you are expected to have a broad understanding of the problem you are facing, and you need to use this knowledge to enhance the research question.

I also want to point out that this step is sometimes underestimated by young researchers. Please note that this is where you need to read others' work, not in later steps. If you discover at the end of your research an important paper that you have missed, it is a problem. The same is true when you discover in the middle of your research that someone has already solved the problem.
In both cases, it is important to take enough time and do the state-of-the-art slowly, covering all aspects of the problem.

We have also talked about the reference card system. The main reason to use it is that you will deal with a large number of scientific works and you simply cannot remember all of them. Additionally, you may need to search for something twice or even more, and it is a waste of time to repeat a search you have already done once. The reference card system is an efficient way to save time if done correctly. You will have a short summary of what you have found, and you can easily go back to the original document to verify it or search deeper.

We have also agreed that today the process is completely automated using a specific family of software. However, I insisted on doing it the traditional way, because I think that if you learn how to do it manually, you can easily learn any software that performs the process. The opposite cannot be guaranteed, because the software might omit or hide some steps, and if you do not know them you will get completely lost.

We also had an example based on the definition of science. We saw that you need to write the information about the reference on a specific card with a unique ID, for example "R15". The quotation and where to find it, which means the reference ID and the page, need to be added to another card called the quotation card, in this case with the ID "Q3". The key point here is that the quotation card contains a reference to the reference card. Of course, you can extend the system further and add cards for authors, for example, but this is completely out of the scope of this course.
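The link between the two kinds of cards can be sketched in a few lines of Python. This is only an illustration of the idea, not the software mentioned above: the class names, field names, and placeholder values are assumptions made for the example. Only the IDs "R15" and "Q3" come from the text.

```python
from dataclasses import dataclass


@dataclass
class ReferenceCard:
    card_id: str   # unique ID written on the card, e.g. "R15"
    authors: str
    title: str
    year: int


@dataclass
class QuotationCard:
    card_id: str        # unique ID of the quotation card, e.g. "Q3"
    reference_id: str   # points back to a ReferenceCard
    page: int
    quotation: str


def find_reference(quotation, references):
    """Follow the link from a quotation card back to its reference card."""
    return references[quotation.reference_id]


# Placeholder data (hypothetical, for illustration only).
references = {
    "R15": ReferenceCard("R15", "Placeholder Author", "Placeholder Title", 2000),
}
q3 = QuotationCard("Q3", "R15", 12, "Placeholder definition of science.")

source = find_reference(q3, references)
print(f"{q3.card_id} comes from {source.card_id}, page {q3.page}")
```

The quotation card stores only the ID of the reference card, so the bibliographic details are written once and every quotation can still be traced back to its source.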

What to include
We can start by asking the first major question of this chapter: what should be included in the state-of-the-art?
 * First, you need to understand that human knowledge is vast, and your first target is to know, as far as possible, all the sources you are searching within. When you search using keywords in search engines, you will not find many studies or articles that address exactly the same problem directly. It is not impossible, but it is rare. You will mostly find scientific works that address one or two aspects of the problem you are studying. That is why it is very important to limit your keyword search to a specific scientific research domain. For example, if you study the relationship between violence and gender in a given society and you search using keywords in Google Scholar, you will find some works related to violence in sport. This is a little far from what you are looking for, because you want to study the problem from the social perspective, not from the sports perspective. To keep it simple, the first step is to identify the research domain. After that, you need to narrow the search. Each domain includes a set of fields of interest, and your problem might be attached to one or two fields. Scientific works from those fields are closely related to your research and should have priority, and to give them priority you need to create categories. With all that in mind, you have to start categorizing by theme, where the level of priority is given to the category and not to the individual article. For example, if you are doing research on the binary logic of computers, the main research domain is mathematics, not physics or geography. However, mathematics itself is quite a large domain, and it includes fields of interest such as algebra and calculus which are far from what you are looking for. In this example, binary logic is treated as part of set theory. So the next step is to identify the field or fields of interest closely related to your research.
 * After that, you need to create a system of priorities, where you give each category a priority level. For example, I will create my system on a scale of three. A category gets priority one if the subject is outside the domain, in this case if it is in physics or geography. It gets priority two if it is from the domain but outside the fields of interest, for example if the category is in mathematics but in algebra or calculus and not in set theory. Finally, the category gets priority three if it is from the field or fields of interest and closely related to the research, in this case if it is inside set theory. Please note that this is just an example and you can do it differently. You can, for example, give the priorities to articles and not to categories, or you can make the scale out of 5 or out of 10. Feel free to be creative, but do not forget to stay systematic.

Let us take the gender gap in English Wikipedia as an example, using the same scale of three. Please note that I am intentionally not going to identify the research domain and the fields of interest. I will leave that to you, because this choice will orient your study in completely different ways, and I want you to be free to orient your own research. The categories you might find can be:
 * studies related to English Wikipedia only, which get priority one;
 * studies related to the gender gap only, which get priority two;
 * studies related to the gender gap in English Wikipedia, which get priority three on our scale because they are closely related to our research;
 * studies related to the knowledge gap only, which get priority one;
 * studies related to the free content gap only, which also get priority one.
So in this case I will study only the articles categorized under priorities two and three. I hope the idea is clear.

This might seem strange if you are doing this for the first time, so let me explain why it is needed. If you do a quick search using today's search engines, it is normal to find almost 50 articles per search session. Yes, 50 articles. That is why you need to limit your search as much as possible. Remember that you strictly need to use selective reading. It is not impossible to read everything you find, but it is not practical. Selective reading was discussed in chapter 5; if you need to know more, please refer to it.

Another problem is that you may selectively read the same article twice. That is why it is also important, if you are working with a large number of articles, to include author information in your card system. You create a card for each author and add the articles to the author card as well. By doing so, you can easily detect duplicated articles.
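The priority scale above can be sketched as a small Python script. This is a minimal illustration, assuming the five example categories and their priorities from the text. The article titles, the function name `select_for_reading`, and the cut-off parameter are hypothetical.

```python
# Priority of each category on the three-level scale described above.
CATEGORY_PRIORITY = {
    "English Wikipedia only": 1,
    "gender gap only": 2,
    "gender gap in English Wikipedia": 3,
    "knowledge gap only": 1,
    "free content gap only": 1,
}


def select_for_reading(articles, minimum_priority=2):
    """Keep articles whose category reaches the minimum priority, highest first."""
    kept = [a for a in articles if CATEGORY_PRIORITY[a[1]] >= minimum_priority]
    return sorted(kept, key=lambda a: CATEGORY_PRIORITY[a[1]], reverse=True)


# Hypothetical search results: (title, category assigned by the researcher).
articles = [
    ("Article A", "English Wikipedia only"),
    ("Article B", "gender gap only"),
    ("Article C", "gender gap in English Wikipedia"),
    ("Article D", "free content gap only"),
]

selected = select_for_reading(articles)
print([title for title, _ in selected])
```

Because the priority is attached to the category and not to the article, changing one entry in the table re-ranks every article in that category at once.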

Objectives of the state-of-the-art
We can now move on to the objectives of the state-of-the-art. In other words, why do we do it, and what do we expect to achieve if we do it correctly?
 * The first and most important thing to verify is that the problem is still open and no one has yet proposed a solution to it. It is also important to check who is currently working on the subject. If you pick a hot topic in the research domain, it is highly probable that someone else is working on it. You need to understand what they are doing, so that you do not repeat the same research. You can also contact these people by email, show them what you want to do, and try to find common ground, so that your research efforts complement one another. This is not a competition.
 * Second, it is also important to understand where your research fits with regard to human knowledge in general, and to the research domain you are working in specifically. The best way to do that is by looking at others' most recent works, so that you have an objective basis to compare your work with. Normally, the open questions mark the limits of human knowledge, so by identifying these questions and comparing them to your own research question you can easily see where your research is located.
 * Third, and this is one of the most underrated objectives of the state-of-the-art: by doing it, you will better understand the context of the problem, how you can handle it, and what works and what does not work when trying to find a solution. If you search well in the scientific works closely related to your research, you can find that there are some dead ends, for example hypotheses that were tested and proven to be false. By knowing them, you can save time and improve the design of your research method, which we will cover in the next chapter.
 * The fourth objective you will achieve if you do the state-of-the-art correctly is to discover other possible research methods. If you look at other scientific works, you can see how others address similar problems, which research methods they use, and why they chose them. You may be able to reuse the same research method or enhance it to adapt it to your own research problem, and by doing so you can save a huge amount of time and effort. So keep in mind that the state-of-the-art is not only a search for information in scientific works, but also a search for research methods and plans that you can reuse, adopt, or enhance to fit your own research.
 * Fifth, and this is also an underrated objective: the state-of-the-art enables you to discover a huge number of sources related to your research problem. We talked about the structure of the scientific work in chapter 5. We said that each scientific work includes a reference section, where the researcher who created the work cites the works of others that were used to complete the research. This section should be one of your targets, because you will always find a list of works probably related to your research, and it is ready for you to search through. In simple words, looking for new works related to your research in the reference sections of others' scientific works is a good practice that you need to adopt and always apply at the start of your research. The objective is to discover others' works related to your research problem.
 * The sixth point is related to the fifth. If you search in the reference sections of others' works, you are traveling back in time, and you will obtain a set of articles that can be chained together to form a timeline. It starts from the first time the question or problem appeared and ends with the latest article, which shows the current achievement toward the solution. Thus, one of the objectives of the state-of-the-art is to create the timeline of the problem. Based on the timeline you have created, you can anticipate what comes next, or which achievement is to be expected next regarding the problem you are addressing. This is a very valuable piece of knowledge that you will need when you design your research method later.
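The last two objectives, following reference sections backwards and chaining the works into a timeline, can be sketched as follows. This is a minimal illustration under stated assumptions: the work IDs, years, and citation links are invented placeholders, and in practice you would fill them in from your own reference cards.

```python
from collections import deque

# Hypothetical works: ID -> (publication year, IDs listed in its reference section).
WORKS = {
    "W4": (2020, ["W3", "W2"]),
    "W3": (2015, ["W1"]),
    "W2": (2012, ["W1"]),
    "W1": (2001, []),
}


def build_timeline(start_id, works):
    """Follow reference sections backwards, then order the works by year."""
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for cited in works[current][1]:
            if cited not in seen:   # skip works already reached via another path
                seen.add(cited)
                queue.append(cited)
    return sorted(seen, key=lambda work_id: works[work_id][0])


timeline = build_timeline("W4", WORKS)
print(timeline)
```

Starting from the most recent work, each reference section adds older works. Sorting the collected set by year gives the timeline from the first appearance of the problem to the current achievement.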

Types of the state-of-the-art
Now we can move on to the types of the state-of-the-art. There are two types, based on how it is going to be used: the stand-alone state-of-the-art and the integrated state-of-the-art. I will define the two by making a small comparison on different criteria.

The first criterion is the output. For the stand-alone state-of-the-art, which is also called a survey, the final output is the objective itself. Here the researcher will not do further research later and is asked to do only the state-of-the-art. Maybe other researchers will continue later, or the value of the research will simply be re-estimated after the state-of-the-art is finished. In some cases this is the objective from the beginning. On the other hand, the integrated state-of-the-art is part of a larger research project. The objective is to start from the research question and dive deeper, and the result is a reformed research question and a better understanding of the problem. In simple words, the integrated state-of-the-art covers the first part or chapter of the scientific work, and it is followed by the other steps discussed previously in chapter 4.


 * The second criterion is the contribution of the research, which means the added value of the work. We have agreed before that a scientific work should create new knowledge. For the stand-alone state-of-the-art, the added value is limited. The researcher works entirely on results obtained by others and will not achieve new results. However, the value of this type of work is that it includes not only the sorting and classification of others' work, but also an analysis of the open questions regarding the research problem, the current achievements, and the current directions of research in the domain. It might not be a large contribution, but it is valuable and time-saving. On the other hand, the contribution of the research in which the integrated state-of-the-art is located is clear and completely independent of it. The state-of-the-art here is only the base on which the original research will be built in later steps.


 * The third criterion is the type of data the research is based on. I said the research and not the state-of-the-art, because we have agreed that the state-of-the-art is based completely on others' work, so this criterion covers the steps that follow it. Because the stand-alone state-of-the-art is based completely on others' work with no further steps, the data used in this research is entirely bibliographic. This means that you can find it in a physical library or online and, as we said before, the bibliography will include only articles or scientific works produced by other researchers. On the other hand, the research based on the integrated state-of-the-art will be based on original findings, which may be obtained from nature through experiments, from a simulation created in a laboratory, or simply from an online questionnaire published on the internet. Clearly, there will be data obtained from the physical world, and it will be added to the research later.
 * The fourth criterion is the publication. For the stand-alone state-of-the-art, the final product of the research is called a survey article. A survey is rich with information on others' work and is a great place to start your own research. However, as we said before, the original contribution of this type of article is limited. Surveys are also peer reviewed, but they have strict conditions for acceptance and normally a high rejection rate. We talked about the peer review process in chapter 5 of this course, so if you are interested in knowing more about it, please refer to that chapter. On the other hand, the integrated state-of-the-art will be published later, together with the other findings. If the output is a journal article, the state-of-the-art might not be published completely due to length restrictions, so it will appear as two or three paragraphs, maybe more, at the top of the journal article. If the work is a thesis or a research report, the state-of-the-art will be added in full and become the first chapter of the work.

Special case: No previous studies
Now let us address another topic related to the state-of-the-art: what do you do when there are no previous studies related to your research problem? Let me analyze this case in several parts.
 * 1) First, it is rare, very rare, that you address a question for the first time in human history. We have said that human knowledge grows slowly, and it is unusual to have something that is completely new and cannot be attached to our current understanding of the universe. However, I said very rare and I did not say impossible. Here is a nice story that shows the exception. I am not sure if you are familiar with the name John Nash, but I think you have watched, or at least heard of, the American film "A Beautiful Mind", which won several Academy Awards in 2002. The film is based on the true story of a man who won the Nobel Prize in economics for his contribution to something called game theory. Nash was a PhD student when he made a foundational contribution to game theory, a field that sits between economics and mathematics. It is a way to understand human behavior, not only in the economy but also in real life. When he wrote his thesis, there were very few earlier scientific works he could refer to. A normal PhD thesis can include 100 to 200 references, and in some cases more, and it is rare that a thesis is approved if it has fewer references than this. However, Nash's thesis has only two references, as the image of the bibliography section of his thesis shown in the video illustrates. So, in simple words, it is a rare but possible case.
 * 2) The second point I want to cover here is the availability of sources in a specific language. For example, if you are doing Islamic studies, there is no question that you need to master Arabic, as almost all primary sources are written in Arabic. The idea here is to consider that sources might be available in different languages and not only in English, and that in some cases they are only available in other languages.
 * 3) The third point you need to consider is that some original works are not free and you need to pay a fee to access them. In this case you cannot say that there are no references: they exist, but you need to pay for them, and you need to include the fees in your research plan.
 * 4) Fourth, some sources are classified by governments or by the private sector, and you might need permission to access these documents or archives. If the sources are classified, they exist, and you need to plan for obtaining the permission, if possible, in your research plan.

But what if your research does not fit any of these cases, and there are really no sources related to your research problem? In this case you need to cover the following aspects:
 * First, a global timeline of the problem. This means that you need to show how we arrived here in the first place, for example which discoveries opened this problem. If you think this is hard, try to go back in time, maybe 10 or 20 years, then 50 and 100 years, and ask yourself why scientists had not asked a question about this problem before. Try to create a chain of events, discoveries, or technologies that made the study of this problem possible. For example, why did scientists not study artificial intelligence, or simply AI, 100 years ago? You can trace this back to the spread of computers in the 1960s and 1970s, which in turn depended on earlier work on semiconductors and the invention of the transistor. So, in simple words, you need to show how we arrived here from the perspective of time.
 * The second aspect is related to place, or simply where this problem evolved and why. Is it related to geography? For example, the discovery of oil in the Middle East had a huge impact on the development of the region in all aspects, whereas the impact of this discovery was limited in other places of the world. So in the second aspect you need to cover the space where the problem was found and try to understand how this place influenced the problem itself.
 * The third aspect is related to who created or discovered the problem, and what the original intention was. How did this person arrive at the problem? What did they do? Do they have other related works? It is highly recommended that you address this aspect after you finish the time and place aspects, because the two will provide you with general knowledge that can make your research much easier.
 * Fourth, you need to focus on the problem itself. What is it? How can we describe it? Is it related to physics or to mathematics, or is it closer to the humanities? You need to identify the domain of human knowledge closest to your problem, and after that search for the field of interest within that domain. In simple words, here you need to show where the problem is located with regard to human knowledge, as we discussed earlier in this chapter.

Finally, after doing all of this, you can write a section based on what you have found, and do not forget to explain that you have not found any relevant sources. In most cases, you will discover plenty of sources while doing the research described above, so the problem will solve itself.

What to avoid!
Before we move to the final section of this chapter, I would like to highlight some major errors and mistakes that you strictly need to avoid while doing your state-of-the-art. I have prepared a list so that you can copy it and check it every time you do a state-of-the-art:


 * 1) The first thing you need to verify is that the state-of-the-art is not just a list of everything you have found regarding the research question. This is not the objective. You need to be creative in identifying the major themes and how they are connected to each other. That is why it is important to describe each scientific work you find with a few keywords that represent its major themes. This also means that you need to classify, sort, and analyze the collected data. We talked about these three operations in a previous chapter. As a brief reminder, classifying means creating groups of articles that share something in common, for example the era in which they were published, or any other classes that might serve you. Sorting means creating an order within each group. For articles this is mainly chronological order, but any other criterion that serves your research is possible. Finally, analysis means extracting new findings from the results of the first two steps, classifying and sorting.
 * 2) The second common mistake you need to avoid is treating the state-of-the-art as a set of abstracts or summaries of other works related to your research problem. In many cases, you might find something problematic in the research method, or a lack of coherence in the results themselves. You need to refer to that, showing how you found it and why you think there is something to fix. And you need by all means to avoid addressing the authors of the work. Just focus on the work itself and not on the authors.
 * 3) The third common mistake you need to avoid is treating the state-of-the-art as a bibliography section. You should never add bibliographical information such as the publisher name and the page number to it. The correct place for this type of information is the bibliography section at the end of the research report, thesis, or article, and it should not appear in the state-of-the-art. However, in many cases you can mention the year of publication, especially when you are trying to create a chronological order or a timeline. You can also use the author's second name to refer to the work itself instead of the title. This is completely fine, but remember again that you need to address the work and not the author. If the author has published several works and you need to use them all in your research, you can refer to each work using the second name of the author and the year it was published. This may still not be enough to distinguish the work uniquely, for example when the author has published two articles in the same year and you need to use the two in your research. In this case, you can mention the venue where the work was published, for example the conference name or the journal name.
 * 4) The fourth common mistake you need to avoid by all means is something I have mentioned twice before, and I am repeating it for the third time to insist on its importance. You need to be careful when mentioning others' work, especially in a negative way. Read the work several times objectively before making a final judgment. You also need to search for later versions of the work, or later works by the same author. It is highly probable that the author has already found the error and corrected it. When you do that, you need to avoid generalization by all means. Read carefully every work you have and search for the details before generalizing your opinion or writing feedback or an overview, especially if it is negative. Also try to check whether any other study adopts the same point of view or goes against it, and find out why in both cases. Always remember to focus on the work itself and not on the authors. Knowing what to avoid makes it easier to see what you need to do.
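The naming rule from the third point (second name alone, then second name and year, then the venue when two works share the same author and year) can be sketched in Python. This is one possible reading of the rule and not a citation standard. The function name, the label format, and the sample surnames and venues are hypothetical.

```python
from collections import Counter


def short_labels(works):
    """works: list of (surname, year, venue). Returns one in-text label per work."""
    by_author = Counter(surname for surname, _, _ in works)
    by_author_year = Counter((surname, year) for surname, year, _ in works)
    labels = []
    for surname, year, venue in works:
        if by_author[surname] == 1:
            # Only one work by this author: the surname alone is enough.
            labels.append(surname)
        elif by_author_year[(surname, year)] == 1:
            # Several works by the author, but only one in this year.
            labels.append(f"{surname} ({year})")
        else:
            # Same author and same year: add the venue to tell the works apart.
            labels.append(f"{surname} ({year}, {venue})")
    return labels


# Hypothetical works, for illustration only.
works = [
    ("Smith", 2019, "Journal A"),
    ("Lee", 2018, "Journal B"),
    ("Lee", 2020, "Conference C"),
    ("Lee", 2020, "Journal D"),
]
print(short_labels(works))
```

Each label carries only as much detail as is needed to identify the work uniquely, which keeps publisher names and page numbers in the bibliography section where they belong.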

How to do the state-of-the-art
In this final section of chapter 7, let me provide you with four steps that you need to follow in order to do the state-of-the-art correctly:
 * 1) First, you need to collect as many studies and scientific works related to your research question as possible. With today's technology, you can use a generic academic search engine such as Google Scholar. If you are familiar with the research domain, you can also use a search engine specific to the domain in which you are doing your study. You need to use two skills we covered in chapter 5: selective reading and the reference card system. The use of selective reading is mandatory. Soon after you start your study, you will have hundreds of pages to read, and the majority of them are expected to be outside the interest of your research. Thus, it is highly recommended to use selective reading to remove any irrelevant content as fast as possible, so that you save time. On the other hand, after you have identified what is relevant to your research question, you need a fast and flexible way to refer to the works. Additionally, you need to create a list of keywords that covers the major aspects of each of the selected works. Here you can use the card system, which satisfies the two needs I have just mentioned. We covered the two skills in detail in chapter 5, so if you are interested in knowing more, please refer to that chapter.
 * 2) The second thing you need to do is to define the outlines of the state-of-the-art. Outlines are the major themes covered by the articles, theses, or other scientific works you collected in the previous step. Soon after you get some articles close to the subject of your research problem and start reading and analyzing them, you will notice a pattern of themes, and you need to identify these patterns. A good state-of-the-art is structured based on themes, not on authors or individual works, and it is a mistake to build your state-of-the-art on the last two. However, after you identify the themes, you can analyze each of them based on the works. This is completely fine. Another approach, in some cases, is to analyze by authors. This applies when there are major developers of the solution, or major research institutions, that have been working on the problem for a while. The time they have spent dealing with the problem is enough for them to have created their own research approach, and in this case presenting the works you have collected based on authors or research institutions is the best way to go. You need to keep in mind that outlines are not a fixed structure. Instead, they are dynamic: the further you go, the more precise the outlines become, and in many cases you need to drop some outlines from the list because you have discovered that they are outside the scope of your research. In simple words, build an adjustable list of outlines.
 * 3) The third step is not only to read the works you have found, but also to criticize them scientifically; this means that you need to apply critical thinking to everything you have read. You have to show independent scientific thinking that reflects your knowledge and understanding of the problem and of the works related to your research. It is highly recommended that you present all the valuable and reliable points of view you have collected regarding the research problem. You can criticize them all if you are able to justify that scientifically; this is completely fine. With all that done, you can move on to the next step.
 * 4) In the fourth and final step, you need to analyze the collected data you presented in the previous step. The analysis is the last and the most important step of the state-of-the-art; it is where you try to explain why there are different opinions on the studied subject, and how you will use that to enhance the research question. The last thing you need to do is to add the research question at the end of the state-of-the-art. We are not talking about the old question you created in the beginning; instead, you need to modify your research question based on your findings in this step. It is also important to have a research question that corresponds to the state-of-the-art. This means that the research question should be the logical finding you have obtained thanks to the state-of-the-art, and that is why you need to update it and change it.
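For readers who keep their reference cards in digital form, the collecting, outlining and sorting steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool prescribed by the course: the `ReferenceCard` structure, its fields and the `group_by_theme` function are assumptions made for this example, and the choice of publication year as the sorting criterion is arbitrary.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ReferenceCard:
    """One card per collected work (hypothetical structure for this sketch)."""
    key: str                                   # short label used to refer to the work
    year: int                                  # publication year
    keywords: set = field(default_factory=set)  # major aspects covered by the work
    critique: str = ""                         # your own critical note on the work


def group_by_theme(cards, outlines):
    """Classify cards under the outlines (themes) whose keywords they share.

    `outlines` maps a theme name to a set of keywords. A card can fall
    under several themes. Cards matching no theme are returned separately,
    as candidates to drop or as a hint that an outline is missing.
    """
    themes = defaultdict(list)
    unmatched = []
    for card in cards:
        matched = [name for name, kws in outlines.items() if card.keywords & kws]
        for name in matched:
            themes[name].append(card)
        if not matched:
            unmatched.append(card)
    # Sort the works inside each theme by a chosen criterion (here: year).
    for name in themes:
        themes[name].sort(key=lambda c: c.year)
    return dict(themes), unmatched
```

Because the outlines are passed in as plain data, they stay adjustable: when a theme turns out to be out of scope, you remove it from `outlines` and run the grouping again.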

General notes on the state-of-the-art
Before finishing this chapter, it is important to gather all of the previous points into four general notes on the state-of-the-art:
 * 1) First, the state-of-the-art is the second step in the scientific method; it starts from the research problem you defined in step one, and it ends with an updated version of the research question. In this step, you are expected to search for the current achievements regarding your research problem, and you need to search only in reliable, peer-reviewed scientific sources.
 * 2) Second, the state-of-the-art needs to be limited by a knowledge domain and fields of interest. A good state-of-the-art stays as close to the research question as possible. So before you do your state-of-the-art, you need to detect the knowledge domain where it belongs, and the fields of interest within that knowledge domain that are closest to your problem.
 * 3) Third, there are two main types of the state-of-the-art, based on how the results obtained are published. The first type is the stand-alone state-of-the-art, which is usually called a survey; it is an overview of a research question that is published alone, with no further research attached to it. It might be used later as a base for research, but the survey is published as it is. The second type is the integrated state-of-the-art; it is published within another scientific work, as a part of a larger document such as a research report or a PhD thesis.
 * 4) Fourth, the state-of-the-art is not a list of previous studies, nor a list of abstracts and summaries; it is a place where you need to analyze others' works and criticize them using critical thinking and logic. The state-of-the-art is also not a bibliographical section and should not be treated as such: never add authors' names, journal names or page numbers in the body of your work within the state-of-the-art. However, in some cases you might need to use the author's name or the name of the journal to uniquely refer to a scientific work, or to differentiate similar works from one another.

Finally, the state-of-the-art includes four sub-steps. First, collect as many scientific works related to your research question as possible. Second, determine the outlines, which are the major themes related to the topic you are doing your research on; you then need to classify the works you have collected, and after that to sort the works within each theme using a specific criterion you have chosen. Third, you need to criticize using critical thinking; do not forget to be objective and to criticize the scientific work, not the authors. Fourth, you need to analyze; this means you have to extract findings from the sorted data you obtained in the previous steps. And do not forget to update the research question at the end of your state-of-the-art.