Access the full text.
Sign up today, get DeepDyve free for 14 days.
A. Helwan, M. Ma'aitah, Hani Hamdan, D. Ozsahin, Ozum Tuncyurek (2021)
Radiologists versus Deep Convolutional Neural Networks: A Comparative Study for Diagnosing COVID-19Computational and Mathematical Methods in Medicine, 2021
O. Wahab, R. Cohen, J. Bentahar, H. Otrok, A. Mourad, Gaith Rjoub (2020)
An endorsement-based trust bootstrapping approach for newcomer cloud servicesInf. Sci., 527
S. Saab, D. Shen, Mohamad Orabi, D. Kors, Rayana Jaafar (2022)
Iterative Learning Control: Practical Implementation and AutomationIEEE Transactions on Industrial Electronics, 69
Mohamad Awada, F. Srour, Issam Srour (2021)
Data-Driven Machine Learning Approach to Integrate Field Submittals in Project SchedulingJournal of Management in Engineering, 37
G. Palmieri, M. Palpacelli, M. Battistelli, M. Callegari (2012)
A Comparison between Position-Based and Image-Based Dynamic Visual Servoings in the Control of a Translating Parallel ManipulatorJ. Robotics, 2012
Samer Saab, Michael Hauser, A. Ray, S. Saab (2021)
Multivariable Nonadaptive Controller DesignIEEE Transactions on Industrial Electronics, 68
R. Ardeshiri, M. Khooban, A. Noshadi, N. Vafamand, M. Rakhshan (2020)
Robotic manipulator control based on an optimal fractional-order fuzzy PID approach: SiL real-time simulationSoft Computing, 24
Florian Pokorny, D. Kragic (2013)
Classical grasp quality evaluation: New algorithms and theory2013 IEEE/RSJ International Conference on Intelligent Robots and Systems
Siyuan Zhang, Yifan Wang, Jiayao Jiang, Jingxian Dong, Weiwei Yi, Wenguang Hou (2021)
CNN-Based Medical Ultrasound Image Quality AssessmentComplex., 2021
Bilal Tay, A. Mourad (2020)
Intelligent Performance-Aware Adaptation of Control Policies for Optimizing Banking Teller Process Using Machine LearningIEEE Access, 8
J. Bagnell, Felipe Cavalcanti, Lei Cui, T. Galluzzo, M. Hebert, M. Kazemi, Matthew Klingensmith, Jacqueline Libby, T. Liu, N. Pollard, M. Pivtoraiko, Jean-Sebastien Valois, Ranqi Zhu (2012)
An integrated system for autonomous robotics manipulation2012 IEEE/RSJ International Conference on Intelligent Robots and Systems
D. Morrison, Peter Corke, J. Leitner (2018)
Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis ApproachArXiv, abs/1804.05172
Bashar Haddad, Samuel Dodge, Lina Karam, Nital Patel, M. Braun (2020)
Locally Adaptive Statistical Background Modeling With Deep Learning-Based False Positive Rejection for Defect Detection in Semiconductor UnitsIEEE Transactions on Semiconductor Manufacturing, 33
Henrik Christensen, Peter Corke (2003)
Visual ServoingInt. J. Robotics Res., 22
R. Megalingam, R. Raj, T. Akhil, Akhil Masetti, Gutlapalli Chowdary, Vignesh Naick (2020)
Integration of Vision based Robot Manipulation using ROS for Assistive Applications2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA)
Alberto Rodriguez, M. Mason, S. Ferry (2011)
From caging to graspingThe International Journal of Robotics Research, 31
Yingbai Hu, Xinyu Wu, P. Geng, Zhijun Li (2019)
Evolution Strategies Learning With Variable Impedance Control for Grasping Under UncertaintyIEEE Transactions on Industrial Electronics, 66
Bruno Sauvet, François Lévesque, SeungJae Park, P. Cardou, C. Gosselin (2019)
Model-Based Grasping of Unknown Objects from a Random PileRobotics, 8
A. Koubâa (2016)
Robot Operating System (ROS): The Complete Reference (Volume 7)Robot Operating System (ROS)
Rayana Jaafar, S. Saab (2018)
Sliding Mode Control with Dirty Derivatives Filter for Rigid Robot Manipulators2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)
N. Correll, Kostas Bekris, D. Berenson, O. Brock, A. Causo, Kris Hauser, K. Okada, Alberto Rodriguez, Joseph Romano, Peter Wurman (2016)
Analysis and Observations From the First Amazon Picking ChallengeIEEE Transactions on Automation Science and Engineering, 15
Tejas Borkar, Felix Heide, Lina Karam (2019)
Defending Against Universal Attacks Through Selective Feature Regeneration2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
A. Loría (2016)
Observers are Unnecessary for Output-Feedback Control of Lagrangian SystemsIEEE Transactions on Automatic Control, 61
S. Hutchinson F. Chaumette
Visual servoing,Springer Handbook of Robotics
Evan Shelhamer, Jonathan Long, Trevor Darrell (2014)
Fully convolutional networks for semantic segmentation2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Evan Fakhoury, P. Culmer, B. Henson (2017)
The impact of visual cues on haptic compliance discrimination using a pseudo-haptic robotic system2017 IEEE International Conference on Robotics and Biomimetics (ROBIO)
H. Abdellatef, M. Hani, N. Shaikh-Husin, Sayed Ayat (2021)
Accurate and compact convolutional neural network based on stochastic computingNeurocomputing, 471
Andrew Nader, D. Azar (2020)
Searching for activation functions using a self-adaptive evolutionary algorithmProceedings of the 2020 Genetic and Evolutionary Computation Conference Companion
C. Lúa, Claudia García, S. Gennaro, B. Castillo-Toledo, María Morales (2020)
Real-Time Hovering Control of Unmanned Aerial VehiclesMathematical Problems in Engineering
D. Azar, Rebecca Moussa, Georges Jreij (2018)
A Comparative Study of Nine Machine Learning Techniques Used for the Prediction of DiseasesInternational journal of artificial intelligence, 16
Shi Qiu, D. Lodder, Feifan Du (2020)
HGG-CNN: The Generation of the Optimal Robotic Grasp Pose Based on VisionIntelligent Automation and Soft Computing, 26
Philippe Saadé, R. Jammal, Sophie Hayek, Jonathan Zeid, O. Falou, D. Azar (2018)
Computer-aided Detection of White Blood Cells Using Geometric Features and Color2018 9th Cairo International Biomedical Engineering Conference (CIBEC)
Bassel Abouzeid, G. Hasbani, J. Abouzeid, Souheil Halleit, K. Jallad (2020)
Cost comparison of gynecologic procedures between the US and a developing country: an observational studyJournal of Robotic Surgery, 16
Samer Saab, Raffi Toukhtarian (2015)
A MIMO Sampling-Rate-Dependent ControllerIEEE Transactions on Industrial Electronics, 62
Sawsan Abdulrahman, Hanine Tout, A. Mourad, C. Talhi (2021)
FedMCCS: Multicriteria Client Selection Model for Optimal IoT Federated LearningIEEE Internet of Things Journal, 8
Jacob Varley, Jonathan Weisz, Jared Weiss, P. Allen (2015)
Generating multi-fingered robotic grasps via deep learning2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
P. Bender, G. Bone (2004)
Automated Grasp Planning and Execution for Real-World Objects using Computer Vision and Tactile ProbingInt. J. Robotics Autom., 19
S. Hutchinson, Gregory Hager, Peter Corke (1996)
A tutorial on visual servo controlIEEE Trans. Robotics Autom., 12
Gangqi Dong, Z. Zhu (2016)
Incremental inverse kinematics based vision servo for autonomous robotic capture of non-cooperative space debrisAdvances in Space Research, 57
S. Saab (2019)
An optimal stochastic multivariable PID controller: a direct output tracking approachInternational Journal of Control, 92
D. Shen, S. Saab (2021)
Noisy-Output-Based Direct Learning Tracking Control With Markov Nonuniform Trial Lengths Using Adaptive GainsIEEE Transactions on Automatic Control, 67
O. Wahab, J. Bentahar, H. Otrok, A. Mourad (2019)
Resource-Aware Detection and Defense System against Multi-Type Attacks in the Cloud: Repeated Bayesian Stackelberg GameIEEE Transactions on Dependable and Secure Computing, 18
(2010)
Advanced Industrial Control Technology
Hanine Tout, N. Kara, C. Talhi, A. Mourad (2019)
Proactive machine learning-based solution for advanced manageability of multi-persona mobile computingComput. Electr. Eng., 80
J. Till, Caroline Bryson, S. Chung, A. Orekhov, D. Rucker (2015)
Efficient computation of multiple coupled Cosserat rod models for real-time simulation and control of parallel continuum manipulators2015 IEEE International Conference on Robotics and Automation (ICRA)
Yingbai Hu, Guanglin Li, Peijiang Yuan, Chenguang Yang, R. Song (2017)
Development of Sensory-Motor Fusion-Based Manipulation and Grasping Control for a Robotic Hand-Eye SystemIEEE Transactions on Systems, Man, and Cybernetics: Systems, 47
S. Saab, Rayana Jaafar (2019)
A proportional-derivative-double derivative controller for robot manipulatorsInternational Journal of Control, 94
Philipp Schmidt, N. Vahrenkamp, Mirko Wächter, T. Asfour (2018)
Grasping of Unknown Objects Using Deep Convolutional Neural Networks Based on Depth Images2018 IEEE International Conference on Robotics and Automation (ICRA)
Hongxing Wei, Zhenzhou Shao, Zhen Huang, Renhai Chen, Yong Guan, Jindong Tan, Z. Shao (2016)
RT-ROS: A real-time ROS architecture on multi-core processorsFuture Gener. Comput. Syst., 56
S. Saab (2017)
A stochastic PID controller for a class of MIMO systemsInternational Journal of Control, 90
G. Kootstra, M. Popovic, J. Jørgensen, K. Kuklinski, K. Miatliuk, D. Kragic, N. Krüger (2012)
Enabling grasping of unknown objects through a synergistic use of edge and surface informationThe International Journal of Robotics Research, 31
M. Lopes, J. Santos-Victor (2005)
Visual learning by imitation with motor representationsIEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 35
S. Saab (2017)
Development of multivariable PID controller gains in presence of measurement noiseInternational Journal of Control, 90
D. Shen, Niu Huo, S. Saab (2021)
A Probabilistically Quantized Learning Control Framework for Networked Linear SystemsIEEE Transactions on Neural Networks and Learning Systems, 33
Clemens Eppner, S. Höfer, Rico Jonschkowski, Roberto Martín-Martín, Arne Sieverling, Vincent Wall, O. Brock (2016)
Lessons from the Amazon Picking Challenge: Four Aspects of Building Robotic Systems
Shehan Caldera, A. Rassau, D. Chai (2018)
Review of Deep Learning Methods in Robotic Grasp DetectionMultimodal Technol. Interact., 2
A. Almeshal, M. Alenezi (2018)
A Vision-Based Neural Network Controller for the Autonomous Landing of a Quadrotor on Moving TargetsRobotics, 7
Yuping Zeng, Jing Sheng, Ming Li (2018)
Adaptive Real-Time Energy Management Strategy for Plug-In Hybrid Electric Vehicle Based on Simplified-ECMS and a Novel Driving Pattern Recognition MethodMathematical Problems in Engineering
Rayana Jaafar, S. Saab (2021)
Approximate Differentiator With Varying Bandwidth for Control Tracking ApplicationsIEEE Control Systems Letters, 5
M. Quigley (2009)
ROS: an open-source Robot Operating System
S. Saab, Paul Ghanem (2018)
A Multivariable Stochastic Tracking Controller for Robot Manipulators Without Joint VelocitiesIEEE Transactions on Automatic Control, 63
A. Morales, P. Sanz, A. Pobil, A. Fagg (2006)
Vision-based three-finger grasp synthesis constrained by hand geometryRobotics Auton. Syst., 54
Mohamad Orabi, Joe Khalife, Ali Abdallah, Z. Kassas, S. Saab (2020)
A Machine Learning Approach for GPS Code Phase Estimation in Multipath Environments2020 IEEE/ION Position, Location and Navigation Symposium (PLANS)
Camilla Follini, V. Magnago, Kilian Freitag, Michael Terzer, C. Marcher, M. Riedl, Andrea Giusti, D. Matt (2020)
BIM-Integrated Collaborative Robotics for Application in Building Construction and MaintenanceRobotics, 10
G. Konidaris, S. Kuindersma, R. Grupen, A. Barto (2012)
Robot learning from demonstration by constructing skill treesThe International Journal of Robotics Research, 31
Hindawi Journal of Robotics Volume 2022, Article ID 2585656, 14 pages https://doi.org/10.1155/2022/2585656 Research Article Autonomous Robotic Manipulation: Real-Time, Deep-Learning Approach for Grasping of Unknown Objects 1 2 1 Malak H. Sayour , Sharbel E. Kozhaya , and Samer S. Saab Department of Electrical Mechatronics and Computer Engineering, Lebanese American University, Beirut, Lebanon Henry Samueli School of Engineering, University of California, Irvine, USA Correspondence should be addressed to Samer S. Saab; ssaab@lau.edu.lb Received 24 January 2022; Revised 15 April 2022; Accepted 17 May 2022; Published 30 June 2022 Academic Editor: Weitian Wang Copyright © 2022 Malak H. Sayour et al. �is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Recent advancement in vision-based robotics and deep-learning techniques has enabled the use of intelligent systems in a wider range of applications requiring object manipulation. Finding a robust solution for object grasping and autonomous manipulation became the focus of many engineers and is still one of the most demanding problems in modern robotics. �is paper presents a full grasping pipeline proposing a real-time data-driven deep-learning approach for robotic grasping of unknown objects using MATLAB and convolutional neural networks. �e proposed approach employs RGB-D image data acquired from an eye-in-hand camera centering the object of interest in the Šeld of view using visual servoing. Our approach aims at reducing propagation errors and eliminating the need for complex hand tracking algorithm, image segmentation, or 3D reconstruction. �e proposed approach is able to e‘ciently generate reliable multi-view object grasps regardless of the geometric complexity and physical properties of the object in question. �e proposed system architecture enables simple and e’ective path generation and a real-time tracking control. In addition, our system is modular, reliable, and accurate in both end e’ector path generation and control. We experimentally justify the e‘cacy and e’ectiveness of our overall system on the Barrett Whole Arm Manipulator. designing hand-engineered features for the detection of 1. Introduction grasps. �e performance is then evaluated according to Recent advancements in robotics and autonomous systems physical models [6] such as the ability to resist external have resulted in an increase in autonomous capabilities and wrenches [7] or the ability to constrain the object’s motion the use of more intelligent machines in a wider range of [6, 8]. Such hand-engineered function could reach a very applications [1–4]. Gripping various geometric objects is a complex state which implies lots of generalization and key competency for robots to achieve more general-purpose simpliŠcation of the kinetic and physical problem. �e work functionality [5]. �e perceptual abilities of such general- in [9] describes an automated grasping system for 2.5D purpose robots might be used to visually recognize robotic objects. �e approach relies on capturing the object surface end e’ector conŠgurations to safely grab and lift an object curvature with a focus on concave and convex edges to without slipping [5]. Multiple techniques have been pro- achieve 2D immobilization using numerical computation posed to generate e’ective grasps for everyday robotic ap- and hand-engineered equations. �e optimal grasp is then plications. Nevertheless, robotic grasping still provides many generated and extended to 2.5D by performing tactile challenges for researchers and is still among the most de- probing. �e work in [10] proposes a grasping algorithm for manding problems in modern robotics. 3 Šnger grippers. �eir grasp synthesis approach relies on �e various approaches of robotic grasping can be di- object curvature by mean of a curvature function. �e vided into two categories, analytical and data-driven [5]. generated grasps are then Šltered to achieve maximum Most of the e’ort of traditional analytical approaches lies in stability using multiple kinematic criteria such as the force- 2 Journal of Robotics appealing for heavy industrial applications [37]. Real-time closure criterion and ﬁnger adaptation criterion. Analytical methods may not be the ideal approach for the exploration control is a critical and essential feature for operation of autonomous industrial manipulators [38]. Multiple of previously unknown scenes since they restrict the system to situations predicted by the programmer [5]. In addition, techniques have emerged to enable real-time control of such methods tend to heavily rely on the geometric features autonomous systems [39, 40]. To enable ﬂexible and easy of a speciﬁc object, which might be corrupted with partial control of industrial manipulators, suppliers like universal and noisy information from image and depth sensors [11]. robot or Barrett often include external controllers or On the other hand, empirical methods provide an increased internal PCs. )ose external controllers however increase cognitive and adaptive capacity to the robots, while elimi- the cost of robot manipulators and make it even harder for companies to adopt autonomous solution for their desired nating the need to manually model a robotic solution [12]. Some techniques associate good grasps points with an oﬄine applications. Implementing easy ﬂexible and modular real-time controllers could help engineers build real-time database [11] reducing the challenge to object recognition and pose estimation during online execution. However, this solutions and decrease the cost autonomous systems. Industrial companies such as Ocado technologies, BMW approach is unsuitable for the exploration of unknown scenes and is unable to generalize to new objects. group, and Amazon are looking for ﬁnding simple and Deep leaning techniques were proven eﬃcient in eﬀective solutions that can lead to accurate grasp generation solving multiple automation problems [13–15] in a variety of unknown objects with low computation eﬀorts [41, 42]. of ﬁelds including medicine [16–18], security [19, 20], )e latter is considered as the primary motivation behind management [21–24], and Internet of things [25]. With our work. We propose a simple, eﬀective, and complete the proliferation of vision-based deep-learning techniques grasp pipeline for practical applications, a pipeline that is easy to implement by novice engineers, easy to handle by [26–29], signiﬁcant advancements have been made in the generation of potential grasps. Multiple deep-learning average computers, and performs well in noisy industrial environment. We present a full real-time grasping pipeline approaches have been studied over the years; see, e.g., [5] and the references therein. Some of the methods rely on with real-time MATLAB Simulink controller and data- driven deep-learning approach for robotic grasping of un- deploying static cameras for object detection and pose estimation [30], and the other class makes use of eye-in- known objects using generative grasps Deep Convolutional hand cameras mounted on the robot manipulator [31]. Neural Networks (CNN). Although static cameras can achieve eﬃcient task per- We make the following contributions: formance, they require complex hand tracking algo- (i) Computer Vision: the proposed computer vision rithms, 3D reconstruction, and image segmentation for approach integrates multiple simple and eﬀective accurate estimation of object pose and orientation [30]. As techniques for grasp generation. In addition, relying for the control and decision-making problem, multiple on eye-in-hand camera and visual servoing, we are architectures have been proposed to achieve autonomous able to reduce the error propagation while elimi- manipulation [30–32]. )e work in [30] proposes the use nating the need of complex hand tracking algo- of a logical structuring of tasks driven by a behavior tree rithms. Detecting and centralizing the object in architecture. )is allows a reduction in execution time question helped in capturing the object features and while integrating advanced algorithms for autonomous led to more accurate grasps. manipulation. However, behavior trees are found to be (ii) Real-Time control: this paper proposes a real-time computationally heavy due to the involved ticking pro- controller using MATLAB Simulink real time for cedure [33]. For complex movement tasks, the reader is autonomous manipulation. )e proposed real-time referred to the work in [34]. )is study proposes a co- controller eliminates the need of internal PC or the variance matrix adaptation-evolution strategies to ensure need of external controllers. Our proposed ap- the safety and robustness of the robot’s movements in a proach involves a simple block diagram, highly noisy environment. )e approach takes advantage of the ﬂexible and more intuitive for the user. dynamic movement primitives method to represent the robot’s reference trajectories in joint space that are used (iii) Flexible grasp generation: since CNN limits the for trajectory tracking. A vision system is employed to get grasp generation to the normal of the plane of view, the coordinates of the target object. )e eﬀectiveness of we propose a ﬂexible multi-view grasp generation the entire system is tested on a 4-DOF Barrett WAM while eliminating such limitations. )e generated robotic arm. )e Robot Operating System (ROS) on the grasp is therefore not restricted to a single plane but other hand is also widely used for both the control rather multiple angles of view capturing more problem and interprocess communication [12]. ROS is an features. open-source robotic middleware platform for large-scale )e structure of the paper is as follows. Section 2 robotic system development. )e work in [35] provides a presents the proposed system as well as an overview of the structured communication layer on top of the host op- methods used for autonomous grasp generation and ma- erating system for heterogeneous cluster computing [36]. nipulation. Section 3 includes our experimental results. While ROS is proven eﬃcient in robotics applications, Finally, the paper comes to a conclusion and motivates real-time capabilities remain a serious issue [37]. ROS further challenges. does not support real-time applications and may not be Journal of Robotics 3 graspable object is needed. Coded in Python, we implement 2. Methodology a task planner that uses PyTorch and OpenCV to detect and In this section, we describe each part of our system archi- locate the object; we apply visual servoing and generate ﬁrm tecture illustrated in Figure 1. In addition, we present our grasps based on RGB-D images of the object in question. proposed grasp planning system that deals with object de- More details about the planner are provided in the following tection, object centralization, and grasp generation algo- section. Once the desired end eﬀector is generated, the rithm; path planning that generates the desired trajectories output is sent to the real-time controller for execution. of the robot joint positions; and the associated tracking controller. 2.2. Grasp Planning. Generating a gripper conﬁguration that maximizes a success metric consists of the main objective of 2.1. System Architecture grasp planning. Taking into consideration numerous object shapes and all possible orientation variations in the work- 2.1.1. Hardware. For autonomous grasping, the Barrett space, the planner aims at providing the correct grasp Whole Arm Manipulator (WAM) 7-DOF robotics arm is representation that leads to successful object gripping. employed. )e WAM arm has direct-drive capability sup- Robust grasp planning considers the presence of pertur- ported by Transparent Dynamics between the motors and bation in the object properties such as shape and poses joints providing a robust control of contact forces. Mounted caused by noisy and partial sensor readings as well as control on the Barrett manipulator is the BH8-series Barrett Hand, a imprecision. To ensure successful and accurate grasp highly ﬂexible eight-axis gripper with three ﬁngers. In our planning, three key tasks are deﬁned: (1) object detection, (2) work, the third ﬁnger is removed to simulate a parallel plate object centralization, and (3) grasp generation. gripper eﬀect. For RGB-D image acquisition, the Microsoft Azure Kinect RGB-D sensor is mounted on the wrist to achieve dynamic sensing of the environment and objects. 2.2.1. Object Detection. Autonomous manipulation requires )e Azure Kinect integrates a Microsoft-designed 1- localization of the desired graspable objects in 3D space. )is Megapixel Time-of-Flight (ToF) depth camera using the information is essential for motion planning since it sets the image sensor presented at ISSCC 2018. )e depth sensor goal location for the manipulator end eﬀector. For per- supports the following ﬁve modes: NFOV unbinned, ceiving objects in the vicinity of a robot, 2D vision data such NFOV 2 × 2 binned, WFOV 2 × 2 binned, WFOV unbinned, as edge information are used to localize the object. A 3 × 3 and passive IR. kernel horizontal and vertical edge enhancement is imple- mented. )e resultant image tends to be noisy which ne- cessitates the use of noise reduction where in our case we 2.1.2. Real-Time Controller and Communication. We set the employ erosion kernels. A 3 × 3 kernel is subsequently ap- Barrett WAM control loop frequency at 500 Hz. )at is, at plied to reduce noise and distortion in the image followed by each 2 ms interval, a signal is sent to the joint encoders a 3 × 3 dilation kernel to restore the previous edge scale. (puck) for the arm to function properly; otherwise, this Following this step, bounding boxes are generated around might lead to a heartbeat fault where the manipulator would the available objects each with a coupled ID. After specifying stop operating. It is important to note that along the ex- the desired Object ID, the generated Bounding Box is locked periment the internal PC of the Barrett WAM was not used. reducing any eﬀect of the external environment noise and/or In addition, for the closed-loop control, current joint angle perturbation. Figure 2 illustrates the steps involved in our acquisition is necessary as well as the desired end eﬀector object detection approach. pose. )is calls for a real-time controller that will ensure stable communication in a precise time window. For that purpose, MATLAB Simulink real-time is selected for robot 2.2.2. Object Centralization. )e main cause behind inac- manipulation. To ensure reliable communication between curate grasp generation is in the camera perspective. For the WAM and the MATLAB, a UDP communication is robust grasp generation, localizing the object is not suﬃcient established. In particular, a UDP-receive and UDP-send and uniform object locations in the ﬁeld of view of the block is used. Each joint has a unique ID which helps in camera could be beneﬁcial. Furthermore, most depth identifying the origin of the message sent as well as the cameras possess an optimal working range. However, when destination of the generated control signal. A customized dealing with unknown objects with diﬀerent heights and UDP-to-CAN bus communication converter is imple- shapes, using a stationary camera would not be the best mented allowing for commands send by UDP from the solution. In our system, we install the camera at the end MATLAB controller to be converted to CAN and sent to the eﬀector making the camera dynamic. As the end eﬀector or WAM joints. camera moves towards an object from its initial position, a better shot can be captured depending on the object shape 2.1.3. Task Planner and Computer Vision. Once our real- and camera position. )is approach assists in reducing the errors resulting from multilevel processing and camera time controller is implemented, our manipulator becomes capable of reaching the desired end eﬀector target. At this perspective. Visual servoing, also known as vision-based robot control, is implemented to center the object in the ﬁeld stage, a task planner that generates the desired end poses based on the surrounding environment and the desired of view of the camera. Visual servoing [43] is a technique 4 Journal of Robotics Customized UDP to CAN bus communication converter Real Time Controller Task Planner (i) Path generation (i) Object detection (ii) Inverse Kinematics Kinect RGB-D (ii) Visual Servoing (iii) Gravity Compensation sensor (iii) RGB-D image processing (iv) PID Controller (iv) Grasp generation using CNN Barrett WAM (v) Motor and encoders transformations and Hand Figure 1: System architecture. Bilateral Filter Edge 3×3 Erosion 3×3 Dilation Dispaly Enhancement Figure 2: Object detection steps. postprocessing techniques and present our CNN for grasp that uses feedback information extracted from a vision sensor to control the motion of a robot [44]. )is technique generation. (see Table 1) minimizes the camera distortion eﬀects and ensures that the object is fully contained in the ﬁeld of view. )ree visual (1) Preprocessing. Irrespective of the available depth-sensing servoing techniques are discussed in the literature; see, e.g., techniques such as Stereo Reconstruction, Infrared, Time of [45], in particular, position-based visual servo (PBVS) and Flight, and LiDAR, the depth information is always bounded image-based visual servo (IBVS) and hybrid visual servo. In by a minimum and maximum range, beyond which the PBVS, vision data is used to reconstruct the 3D pose of the depth information is clipped. Adding to this, light reﬂection, manipulator and a kinematic error is generated and mapped refraction, diﬀusion, and shadows all contribute to lowering to the robot actuator commands. In IBVS, the position error the eﬃciency of the depth sensor and producing invalid is generated directly from the image plan features. Fur- spots in the depth image. To mitigate the eﬀect of invalid data in our depth image that produces invalid grasps, thermore, there exist two architectures for visual servo in- tegration in the control loop. )e ﬁrst architecture is known prepossessing of the depth map is crucial before further data manipulation. To do so, we propose a technique that detects as dynamic look and move, which relies on external input provided by the vision system to the joint closed-loop invalid pixels in the depth image and reconstructs the invalid control of the manipulator. )e second architecture is data. )e proposed preprocessing method is detailed in known as direct visual servo, which uses the vision system Algorithm 1 and illustrated in Figure 3. directly in the control loop. In our work, we use an image- based visual servo in a dynamic look and go conﬁguration. (2) Generative Grasp Convolutional Neural Network. As )e bounding box location is used as our visual feedback and mentioned in the introduction section, the aim of our work the arm is kept moving until the calibrated camera center-line is to implement a full, simple, and eﬀective grasp pipeline to is aligned with the object location, and until the distance from enable grasping of unknown objects that can be easily adopted in the industry. )us, one of the main factors to be the camera to the object’s uppermost plane is equal to an “optimal” depth value for grasp generation. Consequently, the considered is the speed and size of the network. Many acquired depth image is used for grasp generation. networks have been implemented in the ﬁeld to enable accurate and precise grasp generation such as the network presented in [48]. For our pipeline, we use a fully con- 2.2.3. Deep-Learning-Based Grasp Generation. In order to volutional neural network for grasp generation that is generate eﬃcient grasps for diverse geometric shapes, the proposed in [31], which is a generative grasp convolutional employed deep-learning approach is presented. To ensure neural network. )e main advantage behind this network ﬁrm grasping with an eﬀective approach in capturing the relates to its small number of parameters. It consists of six geometric key-points of the objects, we use preprocessing convolutional layers (see Figure 4) with a total of 62,420 techniques targeting edge enhancement, and postprocessing parameters compared to 8,806,986 parameters in the net- techniques for feature matching. In what follows, we de- work proposed by [31]. )e relatively small number of scribe the steps employed in our preprocessing and parameters yields less computation eﬀorts that can be easily Journal of Robotics 5 Table 1: What our work has to oﬀer. A comparison between our approach and the related work in the ﬁeld. Feature [46] [11] [30] [31] [47] Our approach Unknown objects No Yes No Yes Yes Yes Object feature matching NA NA NA No Yes Yes Eye-in-hand camera and object centering No No No Yes No Yes Grasp generation complexity DL + analytical DL 3-stage DL Analytical DL Training data availability No No NA Yes NA Yes Real robot experiments No Yes Yes Yes Yes Yes Multi-view grasp generation No No No No No Yes Robot control No No Yes No No Yes Real-time control without arm controller No No No No No Yes Depth image with (i) 2D gradient (iv) Flood filling Depth inpainting invalid (ii) Thresholding (iv) Dilation and smoothing spots (Black) (iii) Creating mask (OpenCV method) Figure 3: Depth image preprocessing steps. (1) Capture image (with likelihood of invalid depth data) (2) Generate 2D gradient of the depth map (3) )reshold the gradient to produce a binary mask (4) Flood-ﬁll the mask. At this stage a mask that indicates all the invalid spots in the depth image is generated (5) Dilate generated mask with a 3 × 3 kernel (6) Feed the generated mask to OpenCV inpainting function that attempts to deduce and reconstructs the invalid data from the pixels’ value in its neighborhood ALGORITHM 1: Preprocessing algorithm. 300×300 inpainted depth image pi/2 0 θ -pi/2 300×300 pixels Figure 4: Description of the neural network architecture that is implemented in our study. )e generative grasp CNN takes as an input an inpainted depth image, and generates a grasp pose for every pixel, comprising the grasp quality, Q , grasp width, W , and grasp angle, φ . Θ Θ Θ conv 9×9 32 Filters, Stride 3 conv 5×5 16 Filters, Stride 2 conv 3×3 8 Filters, Stride 2 conv Transpose 3×3 8 Filters, Stride 2 conv Transpose 5×5 16 Filters, Stride 2 conv Transpose 9×9 32 Filters, Stride 3 6 Journal of Robotics handled by our employed network compared to other more eliminate all false-positives and would only keep the valid evolved CNN used for the same purpose. grasps for consideration. On the other hand, in most cases, the width of the line )ere are two main advantages of the proposed GG- CNN over other state-of-the-art grasp synthesis CNNs. estimated by the neural network does not match the exact First, rather than sampling grasp candidates, we directly width of the object’s feature. A discrepancy between the real create grasp poses on a pixel-by-pixel basis, similar to feature width and the manipulator opening width can lead to improvements in object detection, where fully con- undesirable results and failure of the grasping attempt. To volutional networks are frequently utilized to provide mitigate the eﬀect of this mismatch, we propose a post- pixel-wise semantic segmentation rather than sliding processing method presented in Algorithm 2 and illustrated windows or bounding boxes [49]. Second, unlike other in Figure 5. CNNs used for grasp synthesis, our GG-CNN has orders Adding to the above, the neural network can detect of magnitude fewer parameters, allowing for fast closed- various potential grasps for a unique object. To eliminate loop grasping. Our grasp detection pipeline can be ex- further chances of errors and confusion, ten trials are speciﬁed. Similar grasps are aggregated based on the ecuted in only 19 ms on a GPU-equipped desktop computer. output of the trials. After querying the trial, the ma- )e neural network proposed approximates the complex nipulator considers the most recurring grasp as the ﬁnal �����→ function M: I ⟶ G, by training the network. M(I) can be potential grasp that is ready for application. Once the learned by using a set of input images IT and corresponding ideal grasp is generated, there is no need to keep the output grasps GT. )e network equation can be expressed by object in the center of view. )e next challenge lies in applying the grasp by accurately controlling the robotic M (I) � Q ; φ ; W . (1) manipulator. Θ Θ Θ Θ )e grasp map G estimates the parameters of a set of grasps, for each Cartesian point in the 3D space corre- 2.3. Path Planning sponding to each pixel in the captured image. It constitutes a 2.3.1. Inverse Kinematics. )e motion planning component set of 3 images denoted as, Q, φ, and W. )e representations are as follows: of the system is responsible for computing trajectories for the arm that will place the gripper ﬁngers at or close to M (I) denotes the neural network with Θ being the weights of the network and I representing the 300 × 300 the generated grasp conﬁguration. To complete this task, a mapping between the world and joint coordinates is pixel RGB-D input image. Q describes the grasp quality. Its scalar value ranges within [0,1] with 0 indicating the required. We approach this problem by using inverse ki- lowest quality while one the highest grasp quality. φ nematics. Inverse kinematics (IK) is a technique that maps denotes the grasp angle where the angles belong to the end eﬀector position to the corresponding joint angles. [(−π/2), (π/2)]. W describes the grasp width of a grasp to Given a desired end eﬀector pose, the IK solver generates the be executed. joint angles needed to move the arm to the desired pose )e network is trained over the Cornell Grasping Dataset avoiding collisions with itself. To compute the correct joint positions that leads to a certain six-dimensional Cartesian which contains 885 RGB-D images of everyday objects, with a variable number of human-labeled positive and negative pose, detailed knowledge of the geometry and conﬁguration of the robot is essential. )e SolidWorks Computer Aided grasps. )e network is trained for 32 epochs over 8 hours on an NVIDIA Quadro P100 GPU. Design (CAD) model provided by Barrett oﬀers detailed information about the joint conﬁguration of the robot. To (3). Postprocessing. )e grasp representation generated by transfer this information into our controller, a Uniﬁed the GG-CNN output consists of a line position coupled with Robot Description (URDF) model is generated. )e URDF a speciﬁc width and angle. )e line position speciﬁes where model takes advantage of the geometric data provided by to look for potential grasp in the depth image, while the SolidWorks and generates an XML ﬁle that contains detailed information associated with each link such as the joints angle provides us with the information about the rotation of the manipulator end eﬀector with respect to its normal. After limits and link length. )e latter parameters are listed in Table 2. experimenting with the neural network performance, a considerable amount of false-positive potential grasps has To apply inverse kinematics (IK), the Inverse Kinematics block provided by Simulink is used. )e IK solver provided been observed. )is can be brought back to the noise in- troduced by the background of the object (i.e., uneven by MATLAB allows for accurate joint angle computation without the need for in-depth knowledge of the robot surface of the table that translates into a small variation in the depth image and triggers the GG-CNN to detect an geometry and joint conﬁguration. It also eliminates the invalid grasp) or sensor readings. For this purpose, we need of setting Denavit–Hartenberg (DH) frames trans- implement a grasp validation condition that checks the formation necessary for generation of the robot inverse amount of depth variation in the potential grasp in order to kinematics model. )e IK block requires a stored con- ﬁguration ﬁle and three main inputs. )e conﬁguration increase the chances of successful grasps. In particular, if depth is below a certain threshold, the grasp generated is ﬁle allows for speciﬁc equation generation associated with the described robot geometry. )e generated URDF model disregarded and not accounted for. )is approach would Journal of Robotics 7 Unmatched width Matched width 1.0 0.5 0.0 –0.5 –1.0 0 20 40 60 80 100 120 Pixels Figure 5: Depth image postprocessing: matching the generated grasp width to the object feature. )e blue line represents the grasp generated by the GG-CNN. )e purple line represents the processed grasp generated matching the grasp line to the object width. (1) Plot the depth versus image pixels from the starting point of the line to its end. th (2) Apply a low pass ﬁlter to the data. In this paper, a 5 -order Butter-worth ﬁlter is selected chosen due to its ﬂat proﬁle across the bandwidth of interest (3) Take the gradient of the low pass ﬁltered depth data. (4) Find all the local minima and maxima above a certain threshold. )is is essential due to the fact that some small irregularities in object’s surface can induce a local minima or maxima. (5) Select the ﬁrst minimum and last maximum and compute the grasp width from the distance between the generated points. ALGORITHM 2: Postprocessing algorithm. Table 2: Barrett WAM joint limits and link lengths. Joint Lower limit (rad) Upper limit (rad) Link Length (mm) J1 −2.6 2.6 Base link 186 J2 −2.0 2.0 L1 550 J3 −2.8 2.8 L2 300 J4 −0.9 3.1 L3 60 J5 −4.76 1.24 J6 −1.6 1.6 J7 −3.0 3.0 Depth 8 Journal of Robotics conﬁgurations can lead to the same end eﬀector pose. is used as a conﬁguration ﬁle. )e proposed method makes the approach highly ﬂexible. Changing the entire ma- Inconsistency of joint conﬁgurations or non-opti- mal path generation can be controversial. )e work nipulator geometry or even using another manipulator can be simply done by replacing the conﬁguration ﬁle in [50] proposes an incremental inverse kinemat- stored in the IK block by the corresponding arm model, ics-based vision servo approach for robotic ma- eliminating the need for link parameter adjustment or axis nipulators to capture non-cooperative objects reconﬁguration. autonomously. )e proposed approach generates )e ﬁrst input speciﬁes the current joint angle conﬁg- the joint angles iteratively at each time stamp based uration of the robot. )e second input allows us to control on the object location detected by a stationary the error tolerance of the end eﬀector position. )at is, if the camera. )e approach assumes a ﬁxed-camera- end eﬀector is within the speciﬁed margin, the pose is frame to base-frame transformation and uses the actual robot pose as a starting point to generate considered as attained. )is input is presented with a list of six variables: δx, δy, δz represent the error tolerance of the appropriate joint angles. Inspired by this technique, we use the current pose of the robot manipulator as Cartesian translation components of the end eﬀector, and δΨ, δθ, δϕ represent the error tolerance of the Euler ro- an initial guess in the provided IK block. )erefore, tation components of the end eﬀector. )e Euler angle conﬁgurations closer to the actual robot state are system is a method to describe the coordinate transfor- favored over others. However, since the base joint mations. It consists of three independent variables, ψ θ, ϕ of the Barrett is limited to −2.6 and 2.6 radiant as deﬁned as successive planar rotation angles around the x, y, shown in Table 2, choosing the closest joint conﬁg- and z axes, respectively. )e latter values describe the end uration as a starting point can result in longer path leading to non-optimal path generation. To resolve eﬀector pose. High tolerance in the translational x-axis means that a relatively high error in the x translation this issue, we propose the use of the two inverse ki- nematics modules. One uses the current joints as component of the gripper is acceptable. Similarly, a relatively high tolerance in the rotational z-axis means that a high initial guess and the other has no initial guess. )e two outputs are fed into an error function. Based on the error in the z orientation component is acceptable. )is helps the IK solver in deﬁning the most critical position output of this function, the conﬁguration with smaller components and plan the path accordingly. error is used. )e third and ﬁnal input deals with the initial guess. )e proposed approach is simple and comes with low Inverse kinematics algorithms do not provide one unique computational eﬀorts. )e eﬃcacy of our proposed so- joint conﬁguration. Multiple joint angles, as well as multiple lution is justiﬁed experimentally. It also generates the paths, can lead to the same end-eﬀector pose in the 3D space. joint angles within ﬁve milliseconds. )is approach Some IK solvers fail to converge sometimes with the eliminates the need for generating adequate initial guess presence of strict time limits and others require heavy or the use of optimization techniques. It is also important computation to plan one path. )e initial guess helps the IK to note that the generated arm motion is not limited to a solver in ﬁnding the best path by providing an ultimate planar motion and can follow any reachable path in the 3D starting point for the algorithm. )is approach reduces space. computation, makes the planner faster, and leads to a more accurate path. )e current joint angles are fed as an initial 2.3.2. Coordinate System. For robotic manipulation, we guess to inform the solver that joint angles closer to the deﬁne two coordinate systems depicted in Figure 6, namely, actual position are desired. )is approach helps with the A and B. We set A, the frame corresponding to the base link elimination of the singularities in the joint conﬁguration that of the Barrett WAM, to be the base frame. We deﬁne B to be rotates the base by 360 degrees, for example, to reach a pose the camera frame placed at arm end eﬀector. To obtain the while zero degrees can do the job. camera frame, the forward kinematics plugin provided by To make sure the joint conﬁguration generated from MATLAB is used. )e camera ﬁxation part is designed in inverse kinematics is reliable, we implement the following SolidWorks and added to the urdf model of the manipulator two-step approach: as a virtual link. )e latter is then used to generate the camera (1) In order to establish our operating space domain, joints frame using forward kinematics and end eﬀector pose. Fol- limits are employed. In particular, before generating lowing this coordinate system, all end eﬀector goal poses are the urdf ﬁle, safe joint limits are added to the CAD deﬁned with respect to the manipulator base. Similarly, all design on SolidWorks and the uniﬁed robot descrip- detected poses and vision output are transformed back from tion model is generated accordingly. )e generated the camera frame to the base frame. )e joint coordinates are URDF is then stored inside the IK block. By following deﬁned with respect to the manipulator home pose, which this approach, all unattainable singularities are auto- implies a fully extended arm in the positive z direction. matically dropped due to violation of the joint limit )erefore, a pose in the joint space is based on seven joint constrains where the space of singularities becomes angles describing the diﬀerence angle between the joint goal conﬁned to the operating space of the robotic arm. and home pose. Mapping from a 6D pose deﬁned in the global (2) It is widely known that Inverse kinematics does not base frame to seven joint values deﬁned in the joint space is generate unique solutions and that multiple joint the main task of the inverse kinematics block. Journal of Robotics 9 Task space x y z X, Y, Z coordinates grasp width grasp Angle End eﬀector Axis‐Angle representation Figure 6: Coordinate system and grasp representation. )e x-axis is represented in red, y-axis in green, and the z-axis in blue. Joint 1 Joint 2 Joint 3 Joint 4 Joint 5 Joint 6 Joint 7 Torques Figure 7: )e ﬁrst seven subplots illustrate the tracking performance of the employed controller corresponding to the seven joints. Each of the seven plots includes the reference trajectory of the joint position versus its actual position. )e eighth subplot shows the relatively smooth torques that are associated with the seven joints. Figure 8: Modeling of ﬁnger closing path. 2.3.3. Safe Motion. Once the joint positions are calculated by exceeds TL2, a torque fault is raised and the arm stops. To the IK block and before feeding it to the position controller, avoid any velocity or torque fault, a velocity limiter block is certain conditions have to be applied to the velocity to secure added. )e speciﬁed block can slow down the velocity while safe control. In Barrett safety system, VL2 is deﬁned as the keeping the control operation safe. )is step fortiﬁes a safe safety system’s motor velocity fault limit while TL2 repre- operation but can introduce jerky movements. To ensure a sents the safety system’s motor torque fault level. Velocities smoothed path, a spline function is used. )is function higher than VL2 cause a velocity fault and stop the entire smooths out the generated path while securing zero initial arm motion in a E-stop eﬀect. Similarly, if any joint torque velocity and acceleration. 10 Journal of Robotics 2.4. Controller. )e nonlinear dynamics model describing order ﬁlter at the PID inputs to reduce the measurement the continuous-time dynamics of a rigid n-link serial robot noise. manipulator, with all actuated revolute joints described in Before sending the torques to the arm, appropriate joint coordinates, can be represented by the Euler-Lagrange transformations are applied to transform the joint torques to model as follows motor torques and then to motor current. For that, we use the following: u � M(q)q € + C(q, q _)q _ + b(q _) + g(q), (2) M � VJ , (3) τ τ i i n n where q ∈ R represents joint angles, q _ ∈ R represents where i is the joint number and V is the joint to motor n×n joint velocities, M(q) ∈ R is a symmetric positive torque transformation matrix provided by Barrett and n×n deﬁnite inertia matrix, C(q, q _) ∈ R is the Coriolis and deﬁned as centripetal matrix, b(q _) ∈ R represents the joint friction, n n g(q) ∈ R is the gravity vector, and u ∈ R stands for the ⎧ ⎪ ⎫ ⎪ ⎪ ⎪ − 0 0 0 0 0 0 ⎪ ⎪ ⎪ ⎪ torque input. ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ Once the desired path is generated, the required torques ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ needed to move the end eﬀector to the desired pose must be ⎪ ⎪ ⎪ ⎪ ⎪ 1 n ⎪ ⎪ 3 ⎪ automatically generated. Many diﬀerent discrete-time ⎪ ⎪ 0 − 0 0 0 0 ⎪ ⎪ ⎪ 2N 2N ⎪ multivariable controllers have been proposed that can be 2 2 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ applied to our problem such as for the deterministic setting ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ [51, 52] and the references therein. However, the inevitable ⎪ ⎪ 1 n ⎪ ⎪ ⎪ ⎪ noise associated with the measurement of the joint angles 0 − − 0 0 0 0 ⎪ ⎪ ⎪ ⎪ ⎪ 2N 2N ⎪ 2 2 ⎪ ⎪ yields poor performance and possible vibrations in the arm. ⎪ ⎪ ⎪ ⎪ )ere are also multivariable iterative-type controllers with ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ the capability of suppressing measurement noise such as the ⎨ ⎬ 0 0 0 − 0 0 0 , (4) ones proposed in [29, 53, 54]; however, accurate tracking ⎪ ⎪ ⎪ N ⎪ 4 ⎪ ⎪ ⎪ may require several iterations, and the learning would be ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ based on a speciﬁc trajectory. A common class of closed- ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ loop controllers employs slide-mode approach; see, e.g., ⎪ ⎪ 1 n ⎪ 6 0 0 0 0 − 0 ⎪ ⎪ [55, 56] and the references therein. )ere are also several ⎪ ⎪ ⎪ 2N 2N ⎪ ⎪ 5 5 ⎪ ⎪ ⎪ optimal stochastic simultaneous multivariable controllers ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ such a PD-type [57], and ones with gravity compensation ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 1 n [58], PID-type [59–61], and even PDD [62]. Although the ⎪ ⎪ ⎪ ⎪ 0 0 0 0 0 ⎪ ⎪ ⎪ ⎪ works in [55–57, 62] include experimental justiﬁcation on ⎪ 2N 2N ⎪ 5 5 ⎪ ⎪ ⎪ ⎪ the Barrett WAM, they do not deal with controlling its ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ gripper. All of the aforementioned strategies and many ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ others come with a number of unique features; however, ⎪ 0 0 0 0 0 0 − ⎭ they require deep comprehension of control theory and/or some knowledge of the statistical models. In this paper, we and N and n are constants provided by Barrett. i i use simple single-input single-output PID controller that Following this step, we multiply the output by the can be designed by any novice engineer. In particular, we motor-toque-to-motor-current ratio provided by Barrett tackle the torque control problem by introducing closed- and ﬁnally send the signal to the arm pucks for motion. loop PID control with gravity compensation. Gravity )e tracking performance of the seven joints and corre- compensation is a well-known technique in robot design, sponding torques are shown in Figure 7. It is important to which compensates for the force generated by the gravity. note that although the Barrett WAM comes with a force- )is technique requires the knowledge of the gravity model. torque sensor at the wrist, we do not use the sensor in our We use the corresponding model parameters provided by controller. Barrett. To account for gravity forces, we set B(q) to be the inverse of the inertia matrix, which is a symmetric and − 1 nonsingular. )at is, B(q) � M (q). )us, the gravity compensation term A(q) is the product of the inverse inertia 2.4.1. Gripper Control. For the control of the Barrett WAM matrix and the gravity force vector, gripper, no closed-loop control is needed to be implemented − 1 A(q) � −M (q)g(q) � −B(q)g(q). )is term is added to and no parameters are needed to be set or tuned. )e gripper the control vector. )e gravity compensation is imple- simply accepts ﬁnger commands in a speciﬁc CAN format mented in a MATLAB Simulink block and added to the provided by Barrett that allows to move the gripper ﬁngers to controller. For closed-loop control, seven independent PID the desired pose as well as open and close commands for controllers are implemented and manually tuned for each grasping and disposing objects. Since the closing proﬁle of joint. To tune the PID, a sine wave is used as a ﬁctitious the gripper is curved rather than being a straight line as reference trajectory of the joints, and the controller pa- shown in Figure 8, modeling of the gripper closing path rameters are adjusted accordingly. We also deploy a ﬁrst- needs to be done in order to link the opening width to the Journal of Robotics 11 Table 3: Grasp performance. Item Box Tape Piller Tube Eraser Open box Number of trials 20 25 15 10 20 5 Success rate (%) 90 92 93 80 95 100 Figure 9: Grasp generation results: comparison between grasp generated by the GG-CNN with and without RGB-D image preprocessing for shiny and black objects. )e upper 2 rows show the generated grasp output without preprocessing of the RGB-D data. Grasps generated are invalid and the depth info is corrupted. )e lower 2 rows show the generated grasp output after preprocessing. )e depth data was reconstructed leading to valid grasps. boxes, tape, tubes, and other objects. )e proposed approach ﬁnger height and avoid any collision with the table. Fol- lowing this step, we use the postprocessed width generated is shown to be eﬃcient with high-success rates as docu- in Subsection B as a pre-grasping ﬁnger opening. )en, only mented in Table 3 even when the objects are shiny and when the gripper reaches the ideal grasping pose, is the associated with a defective depth map. )e used technique closing signal sent. Since the Barrett gripper is equipped with helps in the depth map reconstruction leading to accurate a force-torque sensor, determining the suﬃcient force to grasp generations. Figure 9 shows the neural network output hold the object in question is not needed. )e gripper ﬁngers grasp before and after applying the pre- and postprocessing stop automatically whenever enough force is applied to techniques. stabilize the object in question. As for the multiple view grasp generation, Figure 9 il- lustrates the results of the aforementioned techniques in ﬁltering disturbances from the external environment. )e 3. Experimental Results grasp generation is successful even on transparent or light reﬂecting objects. )is technique enables capturing the full To assess the performance of our proposed system, a series of object feature in a simple and eﬀective way without the need experiments are conducted. Multiple unknown objects with for 3D reconstruction or point cloud generation which could diﬀerent geometric representations are used to test the be challenging in noisy industrial environments. system performance. In our evaluation, we focus on 2 cri- teria: the grasp generation and the control of the manipu- lator. In the ﬁrst series of experiments, we evaluate the 3.2. Overall Performance. Coupled with a reliable control of performance of our approach for grasp generation. In the the Barrett manipulator, successful grasp generation led to second series, the grasp generation is integrated into the successful grasps of multiple objects regardless of their complete grasping pipeline to assess the torque control and geometric shapes or orientation. Visual servoing perfor- path planning. mance is proven eﬃcient in limiting grasp application error resulting from camera perspective. )e implemented algo- 3.1. Grasp Generation. )e eﬃciency of the approach fol- rithm also centers the object of interest in the ﬁeld of view lowed for grasp generation is assessed using various pre- within the desired time frame. Our controller yields smooth, viously unknown objects with complex geometric relatively fast path execution driving the end eﬀector to the representations. )e objects vary between multiple me- desired pose with millimeters accuracy, and matching of the chanical tools, transparent and nontransparent bottles, end eﬀector contact point with the generated grasp, which With pre-processing Without pre-processing Depth RGB image Depth RGB image image with grasp image with grasp 12 Journal of Robotics gripper types. In addition, since convolutional neural made grasping of complex shapes possible. Table 3 shows the success rate of our proposed automated grasping approach. network limits the grasp to the normal of the plane of view, having an eye-in-hand camera allowed for multi- A video describing the overall project performance can be displayed by visiting https://www.youtube.com/watch? view grasp generation. )is technique enables the gen- v�gL5oefp_QKoVideo. eration of grasps in multiple plans covering multiple features and leading to a better grasp generation for complex shapes. 4. Conclusion and Future Work )e multi-view grasp generation was implemented We presentenced a data-driven deep-learning approach to from the vision and grasp generation perspective; how- robotic grasping of unknown objects with eye-in-hand ever, our solution has not employed kinematics. Tech- camera and real-time controller. Our paper has tackled the niques using kinematics would allow approaching the object from various angles, which could signiﬁcantly autonomous grasping problem from start to ﬁnish covering WAM real-time control, data manipulation, grasp genera- increase the grasping accuracy and the object immobility. Furthermore, dynamic collision avoidance implementa- tion, and execution. Our proposed approach is suitable for the exploration of unknown scenes and generation of ef- tion could be of great beneﬁt. Implementing such tech- fective grasps for various geometric shapes and unknown niques might result in a more robust system suitable for objects without the need for complex hand tracking algo- human-robot application by increasing the safety level rithm, image segmentation, or 3D reconstruction. )anks to and ﬂexibility of the system. the unique well-deﬁned decision-making tree and the in- tegration of diﬀerent techniques, we were able to eliminate Data Availability the error propagation resulting from multilevel processing, camera perspective. and non-ideal depth range. Starting )e justiﬁcation of this work is experimental and no speciﬁc with visual servoing, we centered the object in question in data are available. However, we have a link to a video in the the ﬁeld of view reducing perceptual error and eliminating manuscript that illustrates the overall performance that can error propagation. Following this step, we implemented be displayed by visiting https://www.youtube.com/watch? high level prepossessing computer vision-based tech- v=gL5oefp_QKo. niques coupled with edge enhancements and depth data reconstruction method to account for shiny surfaces and Conflicts of Interest light variation making the object localization, grasp generation, and execution more robust. For the grasp No potential conﬂicts of interest are reported by the authors. generation problem, a generative grasp convolutional neural network was implemented providing eﬃcient grasp References conﬁguration for parallel grippers. We topped this by a postprocessing layer that matched the generated width [1] M. Lopes and J. Santos-Victor, “Visual learning by imitation with the object feature gradients. To the best of our with motor representations,” IEEE Transactions on Systems, knowledge, the latter is a functional attribute that has not Man, and Cybernetics. Part B, Cybernetics: a Publication of the been considered in the literature. IEEE Systems, Man, and Cybernetics Society, vol. 35, no. 3, From the robotics and control point of view, we tackled pp. 438–449, 2005. [2] C. Follini, V. Magnago, K. Freitag et al., “Bim-integrated the control of the arm in a simple and eﬀective approach. In collaborative robotics for application in building construction particular, we proposed a simple real-time control imple- and maintenance,” Robotics, vol. 10, no. 1, p. 2, 2021. mentation of the Barrett WAM manipulator using MATLAB [3] B. Abouzeid, G. El Hasbani, J. Abouzeid, S. Halleit, and Simulink real-time module. Unlike the majority of the work K. Jallad, “Cost comparison of gynecologic procedures be- implemented on MATLAB for robotic manipulators, tween the us and a developing country: an observational MATLAB was not only used for simulation or as a tool that study,” Journal of Robotic Surgery, vol. 16, pp. 1–5, 2021. communicates with the arm controller such as the work in [4] E. Fakhoury, P. Culmer, and B. Henson, “)e impact of visual [63, 64]. In fact, we used MATLAB Simulink for the full arm cues on haptic compliance discrimination using a pseudo- control whereas the arm internal PC was left undeployed. By haptic robotic system,” in Proceedings of the 2017 IEEE In- applying our approach to a class of robotic manipulators, ternational Conference on Robotics and Biomimetics (ROBIO), one can eliminate the need of the controller box that takes pp. 1719–1725, IEEE, Macau, Macao, December 2017. [5] S. Caldera, A. Rassau, and D. Chai, “Review of deep learning additional space and increases the cost of the arm. )is could methods in robotic grasp detection,” Multimodal Technologies be substantially beneﬁcial whenever a company desires to and Interaction, vol. 2, no. 3, p. 57, 2018. adapt a robotic solution in limited spaces with lower cost. [6] G. Kootstra, M. Popovic, ´ J. A. Jørgensen et al., “Enabling Furthermore, the proposed approach is not only cost grasping of unknown objects through a synergistic use of edge eﬀective and leads to reduction in space; it is also simple and surface information,” Ae International Journal of Ro- and modular. In our work, the deployed gripper is botics Research, vol. 31, no. 10, pp. 1190–1213, 2012. equipped with a force toque sensor that ensured ﬁrm [7] F. T. Pokorny and D. Kragic, “Classical grasp quality eval- grasps. For future work, a force optimization layer as uation: new algorithms and theory,” in Proceedings of the 2013 proposed in [65] can be added to our system to make the IEEE/RSJ International Conference on Intelligent Robots and approach more modular and functional with diﬀerent Systems, pp. 3493–3500, IEEE, Tokyo, Japan, November 2013. Journal of Robotics 13 [8] A. Rodriguez, M. T. Mason, and S. Ferry, “From caging to scheduling,” Journal of Management in Engineering, vol. 37, grasping,” Ae International Journal of Robotics Research, no. 1, Article ID 04020104, 2021. [23] S. AbdulRahman, H. Tout, A. Mourad, and C. Talhi, vol. 31, no. 7, pp. 886–900, 2012. [9] P. Bender and G. M. Bone, “Automated grasp planning and “FedMCCS: multicriteria client selection model for optimal execution for real-world objects using computer vision and iot federated learning,” IEEE Internet of Aings Journal, vol. 8, tactile probing,” International Journal of Robotics and Au- no. 6, pp. 4723–4735, 2020. [24] B. Tay and A. Mourad, “Intelligent performance-aware ad- tomation, vol. 19, no. 1, pp. 15–27, 2004. [10] A. Morales, P. J. Sanz, A. P. Del Pobil, and A. H. Fagg, “Vision- aptation of control policies for optimizing banking teller based three-ﬁnger grasp synthesis constrained by hand ge- process using machine learning,” IEEE Access, vol. 8, ometry,” Robotics and Autonomous Systems, vol. 54, no. 6, pp. 153403–153412, 2020. pp. 496–512, 2006. [25] H. Tout, N. Kara, C. Talhi, and A. Mourad, “Proactive ma- [11] P. Schmidt, N. Vahrenkamp, M. Wachter, ¨ and T. Asfour, chine learning-based solution for advanced manageability of “Grasping of unknown objects using deep convolutional multi-persona mobile computing,” Computers & Electrical neural networks based on depth images,” in Proceedings of the Engineering, vol. 80, Article ID 106497, 2019. 2018 IEEE International Conference on Robotics and Auto- [26] B. M. Haddad, S. F. Dodge, L. J. Karam, N. S. Patel, and mation (ICRA), pp. 6831–6838, IEEE, Brisbane, QLD, Aus- M. W. Braun, “Locally adaptive statistical background tralia, May 2018. modeling with deep learning-based false positive rejection for [12] G. Konidaris, S. Kuindersma, R. Grupen, and A. Barto, “Robot defect detection in semiconductor units,” IEEE Transactions learning from demonstration by constructing skill trees,” Ae on Semiconductor Manufacturing, vol. 33, no. 3, pp. 357–372, International Journal of Robotics Research, vol. 31, no. 3, 2020. pp. 360–375, 2012. [27] A. Almeshal and M. Alenezi, “A vision-based neural network [13] H. Abdellatef, M. Khalil-Hani, N. Shaikh-Husin, and controller for the autonomous landing of a quadrotor on S. O. Ayat, “Accurate and compact convolutional neural moving targets,” Robotics, vol. 7, no. 4, p. 71, 2018. network based on stochastic computing,” Neurocomputing, [28] A. Helwan, M. K. S. Ma’aitah, H. Hamdan, D. U. Ozsahin, and vol. 471, pp. 31–47, 2022. O. Tuncyurek, “Radiologists versus deep convolutional neural [14] A. Nader and D. Azar, “Searching for activation functions networks: a comparative study for diagnosing COVID-19,” using a self-adaptive evolutionary algorithm,” in Proceedings Computational and Mathematical Methods in Medicine, of the 2020 Genetic and Evolutionary Computation Conference vol. 2021, Article ID 5527271, 9 pages, 2021. Companion, pp. 145-146, Cancun, ´ Mexico, July 2020. [29] D. Shen, N. Huo, and S. S. Saab, “A probabilistically quantized [15] M. Orabi, J. Khalife, A. A. Abdallah, Z. M. Kassas, and learning control framework for networked linear systems,” S. S. Saab, “A machine learning approach for gps code phase IEEE Transactions on Neural Networks and Learning Systems, estimation in multipath environments,” in Proceedings of the vol. PP, 2021. 2020 IEEE/ION Position, Location and Navigation Symposium [30] J. A. Bagnell, F. Cavalcanti, L. Cui et al., “An integrated system (PLANS), pp. 1224–1229, IEEE, Portland, OR, USA, April for autonomous robotics manipulation,” in Proceedings of the 2020. 2012 IEEE/RSJ International Conference on Intelligent Robots [16] P. Saade, R. El Jammal, S. El Hayek, J. Abi Zeid, O. Falou, and and Systems, pp. 2955–2962, IEEE, Vilamoura-Algarve, D. Azar, “Computer-aided detection of white blood cells using Portugal, October 2012. geometric features and color,” in Proceedings of the 2018 9th [31] D. Morrison, P. Corke, and J. Leitner, “Closing the loop for Cairo International Biomedical Engineering Conference robotic grasping: a real-time, generative grasp synthesis ap- (CIBEC), pp. 142–145, IEEE, Cairo, Egypt, December 2018. proach,” 2018, https://arxiv.org/abs/1804.05172. [17] D. Azar, R. Moussa, and G. Jreij, “A comparative study of nine [32] R. K. Megalingam, R. V. Rohith Raj, T. Akhil, A. Masetti, machine learning techniques used for the prediction of dis- G. N. Chowdary, and V. S. Naick, “Integration of vision based eases,” International Journal of Artiﬁcial Intelligence, vol. 16, robot manipulation using ROS for assistive applications,” in no. 2, pp. 25–40, 2018. Proceedings of the 2020 Second International Conference on [18] S. Zhang, Y. Wang, J. Jiang, J. Dong, W. Yi, and W. Hou, Inventive Research in Computing Applications (ICIRCA), “CNN-based medical ultrasound image quality assessment,” pp. 163–169, Coimbatore, India, July 2020. Complexity, vol. 2021, Article ID 9938367, 9 pages, 2021. [33] M. Colledanchise and P. Ogren, Behavior Trees in Robotics [19] T. Borkar, F. Heide, and L. Karam, “Defending against uni- and AI: An Introduction, CRC Press, Boca Raton, FL, USA, versal attacks through selective feature regeneration,” in Proceedings of the 2020 IEEE/CVF Conference on Computer [34] Y. Hu, X. Wu, P. Geng, and Z. Li, “Evolution strategies Vision and Pattern Recognition (CVPR), pp. 709–719, June learning with variable impedance control for grasping under uncertainty,” IEEE Transactions on Industrial Electronics, [20] O. A. Wahab, J. Bentahar, H. Otrok, and A. Mourad, “Re- vol. 66, no. 10, pp. 7788–7799, 2018. source-aware detection and defense system against multi-type [35] A. Koubaa, ˆ Robot Operating System (ROS), Springer, Berlin, attacks in the cloud: repeated Bayesian Stackelberg game,” Germany, 2017. IEEE Transactions on Dependable and Secure Computing, [36] M. Quigley, K. Conley, B. Gerkey et al., “ROS: an open-source vol. 18, no. 2, pp. 605–622, 2019. robot operating system,” in Proceedings of the ICRA Workshop [21] O. A. Wahab, R. Cohen, J. Bentahar, H. Otrok, A. Mourad, on Open Source Software, Kobe, Japan, January 2009. and G. Rjoub, “An endorsement-based trust bootstrapping [37] H. Wei, Z. Shao, Z. Huang et al., “RT-ROS: a real-time ROS approach for newcomer cloud services,” Information Sciences, architecture on multi-core processors,” Future Generation vol. 527, pp. 159–175, 2020. Computer Systems, vol. 56, pp. 171–178, 2016. [22] M. Awada, F. J. Srour, and I. M. Srour, “Data-driven machine [38] P. Zhang, Advanced Industrial Control Technology, William learning approach to integrate ﬁeld submittals in project Andrew, Norwich, NY, USA, 2010. 14 Journal of Robotics [39] C. Acosta Lua, ´ C. C. Vaca Garc´ıa, S. Di Gennaro, B. Castillo- Electronics & Mobile Communication Conference (UEMCON), Toledo, and M. E. Sanchez ´ Morales, “Real-time hovering pp. 716–720, IEEE, New York, NY, USA, November 2018. [56] R. H. Jaafar and S. S. Saab, “Approximate diﬀerentiator with control of unmanned aerial vehicles,” Mathematical Problems varying bandwidth for control tracking applications,” IEEE in Engineering, vol. 2020, Article ID 2314356, 8 pages, 2020. Control Systems Letters, vol. 5, no. 5, pp. 1585–1590, 2020. [40] Y. Zeng, J. Sheng, and M. Li, “Adaptive real-time energy [57] S. S. Saab Jr., M. Hauser, A. Ray, and S. S. Saab, “Multivariable management strategy for plug-in hybrid electric vehicle based nonadaptive controller design,” IEEE Transactions on In- on simpliﬁed-ECMS and a novel driving pattern recognition dustrial Electronics, vol. 68, no. 7, pp. 6181–6191, 2020. method,” Mathematical Problems in Engineering, vol. 2018, [58] S. S. Saab and P. Ghanem, “A multivariable stochastic tracking Article ID 5816861, 12 pages, 2018. controller for robot manipulators without joint velocities,” [41] C. Eppner, S. Hofer, ¨ R. Jonschkowski et al., “Lessons from the IEEE Transactions on Automatic Control, vol. 63, no. 8, amazon picking challenge: four aspects of building robotic pp. 2481–2495, 2017. systems,” in Robotics: Science and Systems, 2016. [59] S. S. Saab, “An optimal stochastic multivariable PID con- [42] N. Correll, K. E. Bekris, D. Berenson et al., “Analysis and troller: a direct output tracking approach,” International observations from the ﬁrst amazon picking challenge,” IEEE Journal of Control, vol. 92, no. 3, pp. 623–641, 2019. Transactions on Automation Science and Engineering, vol. 15, [60] S. S. Saab, “Development of multivariable PID controller gains no. 1, pp. 172–188, 2016. in presence of measurement noise,” International Journal of [43] F. Chaumette, S. Hutchinson, and P. Corke, “Visual servoing,” Control, vol. 90, no. 12, pp. 2692–2710, 2017. in Springer Handbook of RoboticsSpringer, Berlin, Germany, [61] S. S. Saab, “A stochastic PID controller for a class of MIMO systems,” International Journal of Control, vol. 90, no. 3, [44] S. Hutchinson, G. D. Hager, and P. I. Corke, “A tutorial on pp. 447–462, 2017. visual servo control,” IEEE Transactions on Robotics and [62] S. S. Saab and R. H. Jaafar, “A proportional-derivative-double Automation, vol. 12, no. 5, pp. 651–670, 1996. derivative controller for robot manipulators,” International [45] G. Palmieri, M. Palpacelli, M. Battistelli, and M. Callegari, “A Journal of Control, vol. 94, no. 5, pp. 1273–1285, 2021. comparison between position-based and image-based dy- [63] R. R. Ardeshiri, M. H. Khooban, A. Noshadi, N. Vafamand, namic visual servoings in the control of a translating parallel and M. Rakhshan, “Robotic manipulator control based on an manipulator,” Journal of Robotics, vol. 2012, Article ID optimal fractional-order fuzzy PID approach: sil real-time 103954, 11 pages, 2012. simulation,” Soft Computing, vol. 24, no. 5, pp. 3849–3860, [46] J. Varley, J. Weisz, J. Weiss, and P. Allen, “Generating multi- ﬁngered robotic grasps via deep learning,” in Proceedings of [64] J. Till, C. E. Bryson, S. Chung, A. Orekhov, and D. C. Rucker, the 2015 IEEE/RSJ International Conference on Intelligent “Eﬃcient computation of multiple coupled cosserat rod Robots and Systems (IROS), pp. 4415–4420, IEEE, Hamburg, models for real-time simulation and control of parallel Germany, September 2015. continuum manipulators,” in Proceedings of the 2015 IEEE [47] B. Sauvet, F. Levesque, S. Park, P. Cardou, and C. Gosselin, International Conference on Robotics and Automation (ICRA), “Model-based grasping of unknown objects from a random pp. 5067–5074, IEEE, Seattle, WA, USA, May 2015. pile,” Robotics, vol. 8, no. 3, p. 79, 2019. [65] Y. Hu, Z. Li, G. Li, P. Yuan, C. Yang, and R. Song, “Devel- [48] S. Qiu, D. Lodder, and F. Du, “HGG-CNN: the generation of opment of sensory-motor fusion-based manipulation and the optimal robotic grasp pose based on vision,” Intelligent grasping control for a robotic hand-eye system,” IEEE Automation and Soft Computing, vol. 26, no. 6, pp. 1517–1529, Transactions on Systems, Man, and Cybernetics: Systems, 2020. vol. 47, no. 7, pp. 1169–1180, 2016. [49] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, April 2015. [50] G. Dong and Z. H. Zhu, “Incremental inverse kinematics based vision servo for autonomous robotic capture of non- cooperative space debris,” Advances in Space Research, vol. 57, no. 7, pp. 1508–1514, 2016. [51] S. S. Saab and R. Toukhtarian, “A MIMO sampling-rate- dependent controller,” IEEE Transactions on Industrial Electronics, vol. 62, no. 6, pp. 3662–3671, 2014. [52] A. Loria, “Observers are unnecessary for output-feedback control of Lagrangian systems,” IEEE Transactions on Auto- matic Control, vol. 61, no. 4, pp. 905–920, 2015. [53] D. Shen and S. S. Saab, “Noisy output based direct learning tracking control with Markov Nonuniform trial lengths using adaptive gains,” IEEE Transactions on Automatic Control, [54] S. S. Saab, D. Shen, M. Orabi, D. Kors, and R. Jaafar, “Iterative learning control: practical implementation and automation,” IEEE Transactions on Industrial Electronics, vol. 69, 2021. [55] R. H. Jaafar and S. S. Saab, “Sliding mode control with dirty derivatives ﬁlter for rigid robot manipulators,” in Proceedings of the 2018 9th IEEE Annual Ubiquitous Computing,
Journal of Robotics – Hindawi Publishing Corporation
Published: Jun 30, 2022
You can share this free article with as many people as you like with the url below! We hope you enjoy this feature!
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.